STATISTICIANS—TODAY AND TOMORROW
PRESIDENTIAL ADDRESS 


Walter E. Hoadley, Jr.
Armstrong Cork Company 


Delivered at Annual Meetings of the American Statistical Association, 8:00
p.m., Monday, December 29, 1958, Congress Hotel, Chicago, Illinois.


Let me say at the outset that this has been a very interesting and worthwhile
year for me as President of your association. I’ve particularly enjoyed the 
opportunity to widen my acquaintance considerably among statisticians not 
only in this country but in Canada and many sections of the world. When the 
year began I really thought that I had a fairly good understanding of who 
statisticians were and what they did—but very frankly I’ve learned a lot. My 
experiences this year have made me much more conscious of the tremendous 
breadth of the field of statistics and the seemingly almost endless diversity of 
our professional interests and applications. At the moment, I’d be hard pressed 
to mention a field of science, business, agriculture, labor, or government in 
which statisticians are not active or statistics not gaining greater acceptance 
and use. 

In the Business and Economics Statistics field with which I am most familiar, 
I am particularly conscious of the major forward strides which statisticians and 
statistics have been making. Yet, our colleagues with other specialized statis- 
tical interests would have no trouble demonstrating, I’m sure, that parallel or 
even greater gains are being achieved elsewhere in our profession. I need only 
to mention the fields of weather; physics, including astro-physics and rocket 
research; legal evidence; biology and medicine; and engineering, including testing
of materials and location of factories. Automatic data processing certainly
received its most important original boost from statisticians. 

As rather conclusive evidence of the rising interest in statistics, the member- 
ship of our association has increased substantially this year. At the beginning 
of the year we had 5,667 members, and today the membership numbers over 
6,400, an advance of more than 10 per cent. This increase is particularly sig- 
nificant in light of the economic recession which has prevailed this past year. 

No one can say with any real precision just how many statisticians there are 
in the United States today. A great deal of the difficulty of determining this 
number lies in the definition of “statistician.” Whereas only a few thousand 
men and women possibly identify themselves as “statisticians,” for example, 
on income tax returns, hundreds of thousands use widely varying statistical 
techniques every day in the course of their work. Doctorates in statistics 
granted to date in this country—which leads the world in statistical usage—
would number less than 500, with only 27 granted in the 1956-57 college year. 
But more doctorates are being granted in statistics than ever before
and many doctorates in other fields now include requirements in statistics 
(e.g., my own doctorate in economics). The very small number of doctorates 
in statistics points to the present basic scarcity of statisticians. 

Each successful demonstration of how statistical techniques can be used to 
solve important problems has whetted the appetite not only of statisticians but 
especially policy-makers whose interest and support can have a profound effect 
upon the future of our profession. 

One of the most significant developments in recent years has been the ac- 
ceptance by the general public of—indeed the demand by citizens at large for— 
more statistical information. At times it seems hard to realize that many widely 
used current statistical measures and techniques have had their origin within 
the past two decades, and many older ones have become sufficiently well 
understood that they can be mentioned without explicit definition in the press 
and over radio and TV. I have in mind such measures as gross national product, 
labor force, personal income, weather developments; polio, cancer, heart and 
other health statistics; agricultural crops and forecasts; accidents; births and 
deaths; metropolitan and suburban growth; school needs; and election and 
sports statistics. 

I don’t mean to imply that statistics of all types have become a featured 
attraction for daily living—far from it. What I do mean is that the level of 
literacy and general understanding has advanced so much that it is possible to 
arouse interest in numerous subjects these days by statistically gathered and 
appraised facts as well as emotions. 

The new break-through in statistics, of course, has been aided immeasurably 
by marked advances in electronic computers of many sizes. High speed elec- 
tronic equipment now makes possible statistical inquiries which previously lay 
beyond human practicability. For several years it seemed to be more fashion- 
able than economical to have a huge “electronic brain” installed in many opera- 
tions. However, the “shake-down” era is nearing an end. Tangible cost saving 
results are beginning to be achieved, particularly in business. The statisticians’ 
tools have improved immensely in recent years, increasing not only the ef- 
ficiency and effectiveness of statisticians at work, but vastly enlarging the 
opportunities for statistical inquiry. 

As a corporation treasurer, I am well aware of the premium which the in- 
vestment specialists place on “growth” companies. These latter are defined 
generally as companies which have demonstrated ability to expand faster and 
more profitably than the industries of which they are a part and the national 
economy as well. It seems clear to me that the “growth” label can be properly 
affixed to the field of statistics and to the expanding opportunities for statis- 
ticians generally. Statistical needs and applications are expanding on all sides; 
more individuals are employed on a full or part-time basis to use statistical 
techniques than ever before; much improved equipment is available to get the 
job done more swiftly and accurately, and extensive research is under way to 
find still better methods and machines.

Yes, we can take a great deal of satisfaction in the growth of our profession
in recent years. We can be proud of the status and achievements of statisticians 
today. We can look forward confidently to still greater accomplishments over 
the many tomorrows ahead. 

But, before we allow ourselves to become too satisfied as members of the 
statistics profession, shouldn’t we also take stock of the problems or roadblocks 
to be overcome before tomorrow’s potentialities can be fully realized? Frankly, 
such problems and roadblocks are not hard to find. In fact, an impartial ob- 
server might even conclude that all is not nearly as well with statistics and 
statisticians as might appear from the encouraging growth story which has just 
been presented. 

There is one general and continuing problem confronting our profession, 
largely because of the diversity and varying degrees of interest in statistics. It 
concerns the scope and effectiveness of publications. Since this problem already
is getting a great deal of official attention, I do not plan to discuss it specifically 
here. Our competent publications policy committee and editors are carefully 
studying this entire matter and further progress in publications can be ex- 
pected. 

At least five other major problems, however, confront the statistics profession 
and hence in many respects the American Statistical Association as well: 

(1) A dangerous cleavage between so-called mathematical or abstract statisticians
    and so-called non-mathematical or applied statisticians.

(2) A lack of a comprehensive retraining and refresher plan for practicing
    statisticians.

(3) An all too frequent failure of statisticians to pursue their work to the
    point of interpreting their findings and aiding policy-making.

(4) The absence of a vigorous program to win broad support for better
    statistics in public and private life.

(5) A strong tendency toward splintering of interests into specialized
    statistical fields.

Let us consider each of these problems and some suggestions for their solution. 


1. CLEAVAGE BETWEEN MATHEMATICAL AND NON-MATHEMATICAL STATISTICIANS


All statisticians, of course, use some mathematics but there is quite a differ- 
ence these days in the extent to which advanced or higher mathematical 
techniques are employed. Many statisticians trained before World War II— 
who incidentally now often hold some of the highest professional and policy- 
making positions—shy away from describing themselves as statisticians. They 
feel some inadequacy about their command of mathematics and frequently be- 
lieve they no longer qualify as members of the statistical profession. 

It is becoming increasingly evident that there is a substantial difference in 
the degree of mathematical preparation among practicing statisticians. While 
this has always been true, the difference has now become increasingly significant 
because it provides the basis for a growing cleavage among our membership. 
Those who speak a language largely of mathematical symbols and formulae 
appear to have less and less in common with those who employ more simplified 
techniques and expressions, and vice versa.

How often have you heard members of our association—not to mention
members of many other related professional associations—observe that an 
article or discussion is either “too cluttered up with mathematical equations” 
or “too superficial and general” to be of interest. Such comments are overt 
symptoms of the cleavage between statisticians who greatly emphasize mathe- 
matical techniques and those who take a much less rigorous mathematical 
approach. Unfortunately, this cleavage often becomes so deep-seated that many 
statisticians find themselves literally forced to take sides, i.e., to classify them- 
selves as members of either the higher-mathematical or non-higher-mathemati- 
cal camp. The final step is often withdrawal of individuals and groups from 
general activities and discussions at national, regional, and chapter meetings 
into tightly knit groups or organizations evidencing pride in their command of, 
or contempt for, mathematics. 

Perhaps I’ve overdrawn this point a bit, but I have a great deal of personally 
compiled evidence that this cleavage problem is not just imagination. 

All this might seem amusing if it were not so far-reaching and potentially 
dangerous for all of us. Continuation of this cleavage trend not only could fore- 
shadow disaster for our association, but could threaten to undermine the ability 
of the statistical profession to serve many important fields of endeavor by 
spreading disunity and dissension. 

In my judgment, this cleavage problem is the natural outcome of professional 
advancement. It arises principally because of the refinements made in statis- 
tical techniques resulting from wider application of rigorous mathematical 
disciplines in solving statistical problems. The growing use of electronic com- 
puters and related high-speed equipment has been an additional factor. Those 
who have contributed to or benefited from the mathematical break-through in 
statistics during the past decade or two rightfully and understandably believe 
others are overlooking tremendous opportunities for improved problem solving 
when they ignore or otherwise fail to use the newer applications of mathemati- 
cal techniques to statistical methodology and analysis. Certain mathematical 
statisticians may tend to give the impression that they now have a superior 
standing in the profession since they have a greater command of the latest 
statistical developments which admittedly seem to involve a good deal of 
mathematics. I am sure that such psychological considerations have aggravated 
the cleavage problem. 

Those who cannot or will not admit to being mathematical statisticians 
usually are frank to admit they do not have the mathematical training to 
follow in detail many of the newer statistical techniques and applications. 
Commonly, however, they hasten to add doubts about the necessity or advisa- 
bility of using higher mathematical language in dealing with many statistical 
problems. They bluntly challenge some claims that advanced mathematical 
techniques eliminate judgment biases, by drawing attention to the human 
selection factor in picking assumptions underlying many complex mathematical 
formulae and equations. Nevertheless, these same statisticians pay a good deal 
of tribute to the internal consistency and logical development process to be 
found in the wider use of mathematics in solving statistical problems. 

By now most of you probably will have decided whether you fall into the 
“heavy-on-the-math” or “go-easy-on-the-math” category and have reasoned 
that I’ve not adequately stated the case for your side. I wonder, however, if
you wouldn’t all agree that fundamentally this cleavage problem concerns the 
means and not the end of statistics, which is to help analyze and solve important 
problems affecting almost all aspects of human existence. The key objective 
is to get each problem correctly solved—whatever it is—and of far less impor- 
tance is precisely how this is accomplished. If each one of us will keep this 
basic end of statistics in mind, we will have taken a long step forward in settling 
this cleavage problem. But there’s much more to do to insure that more or less 
mathematics is not allowed to cause a further and perhaps far-reaching schism 
in our professional ranks. 

Ideally from at least one viewpoint, the “non-higher-math” group could all 
go back to school and learn or brush up on mathematics. This has some obvious 
practical limitations. I’ll comment a bit later on meeting the retraining and 
refresher needs of statisticians. Suffice it to say here, the wider applications of 
mathematics to statistics and electronic equipment are of such obvious mag- 
nitude that they cannot be lightly dismissed. As a minimum, each non-higher- 
mathematical statistician owes it to himself to evaluate (with the aid of an 
interpreter if necessary) the alleged advances and contributions to statistics 
from more extensive use of mathematics. Not until this has been done can any- 
one rightfully minimize or ignore the increasing role of mathematics in the field 
of statistics. I suspect that the expression “ignorance breeds contempt” may 
have some application in statistics as well as in other fields. 

Now I have a few words for the mathematical statisticians. I have been 
greatly impressed by the contributions to statistics made by more extensive 
use of mathematical disciplines. However, I suspect from checking the litera- 
ture and evaluating discussions I’ve heard or participated in that there is a 
noticeable tendency for some mathematical specialists to become more inter- 
ested in problem solving per se than in using their skills to meet some of the 
practical problems of the day. With mathematics, it’s really so easy at 
times to skip over the troublesome and seemingly minor variables and reach 
conclusions which really don’t solve the policy problem at hand. 

Would you agree again that it is the end and not the means of statistics that 
really counts? No one can object if specialists develop and use a language all 
their own—that is, for a while. But unless this language after a reasonable 
amount of time can redound to the benefit of others—especially in this case, 
those less skilled in mathematics, wouldn’t it be proper to question the time, 
effort, and money spent developing the language? All statisticians, whether 
primarily mathematical or not, need to learn how to express themselves in non- 
technical language. Statistical shorthand is absolutely essential, but when data 
are presented to policymakers and the public they must not be in our shorthand. 

In my judgment, one of our profession’s greatest needs is for a means to get 
the so-called mathematical and so-called non-mathematical statisticians to sit
down regularly with each other and explore areas of mutual interest. In most 
instances, the ends will prove to be the same. Why shouldn’t it be possible to 
achieve greater understanding and mutual trust by carefully exploring the ad- 
vantages and disadvantages of particular means? 

It is my recommendation (1) that our association organize a new committee
on Statistical Techniques and Applications to bring together individuals
representative of the degrees of mathematical training of our membership to
investigate the full scope of this cleavage problem and to suggest a program to
minimize it, and (2) that immediately all program committees and publications
officials in particular cooperate closely with this new committee to help achieve
better understanding in all areas where this mathematical cleavage problem
exists.

AMERICAN STATISTICAL ASSOCIATION JOURNAL, MARCH 1959


2. LACK OF COMPREHENSIVE REFRESHER TRAINING PLAN 


In any field where rapid advances in theory, techniques and applications 
take place, inevitable refresher training problems arise. Teachers, of course, 
are constantly under pressure to “keep up” in their fields and to incorporate 
the newest developments to the fullest extent possible in their own courses. 
While there are inevitable lags between research and practice on the one hand 
and textbooks, articles, and instruction on the other, teachers typically have 
both the professional incentive and opportunity to keep reasonably abreast of 
advances in their given fields if they desire to do so. 

For the individual who is a practicing statistician rather than a teacher, the
problem of “keeping up” can be a fearful one. After considerable formal training 
in statistics under recognized authorities, let’s say a statistician starts to prac- 
tice his profession in a private business organization, public utility, labor union, 
or government agency. He brings to his new job a capital fund of knowledge 
and must gain as quickly as possible the practical experience needed to do his 
job well. His work environment, particularly the availability of others with his 
same interests and access to essential literature, plus his own personal deter- 
mination will largely influence whether he keeps abreast of important develop- 
ments. Without regular study, however, any practicing statistician is almost 
certain to find with the passage of time after concluding his formal education 
that he’s drawing heavily upon his capital fund of knowledge and not adding 
regularly to it. How to keep up becomes a fearful problem indeed! 

The problem becomes even more acute for the statistician who advances to 
the point of assuming administrative responsibilities. First, he begins to lose 
touch with day-to-day statistical operations; second, his responsibilities often 
become so broad that he cannot hope to keep up in all the specialized areas 
under his administration; third, his associations are more and more with executive
and other administrative people who look to him and his subordinates for
technical assistance; and fourth, the time available for study seems to diminish 
directly with rising official, professional, civic, family, social and related duties. 

Many of these “administrative” statisticians from time to time have a 
frightening experience when they browse through a copy of a current technical 
journal or read a report from one of their own staff. The material often doesn’t 
seem too familiar and at times isn’t even intelligible. Such administrative 
statisticians may well be doing an outstanding job, but can’t help wonder if 
they wouldn’t be able to contribute more if they had the benefit of greater 
understanding and wider application of newer theories and techniques. 

During the past year this refresher training problem has arisen in many dis- 
cussions. Statisticians in a wide variety of fields and holding positions of vary- 
ing responsibilities have inquired about opportunities for refresher training 
courses. Frequently, the request is for a survey course of trends and developments
in statistical theory, techniques, and practice rather than for a course
devoted to the specific details involved. Others want to learn rudimentary 
mathematics to understand better much of the current statistical literature. 
Still others desire to learn more about a particular technique which they have 
reason to believe would help them in their work. 

It is only natural that statisticians should turn to the American Statistical 
Association for help in keeping up to date in their profession, and I firmly 
believe the ASA should make a strong effort on their behalf. 

The association today offers a great deal in the way of literature, but rela- 
tively little assistance in the specific refresher area. Scores of our members, 
of course, are busily engaged in teaching adult classes; brief regional and local 
conferences have been held; and a few outstanding summer courses are now 
available. But, the ASA really has no formulated national program to meet 
what seems to be a pressing need from many of its members. 

Admittedly successful development of a refresher program will not be easy, 
as some preliminary investigations have revealed, but the results could be very 
rewarding to our members and the association as well. The needs of unnum- 
bered potential enrollees will vary widely; duration and location of refresher 
courses will pose obvious problems; recruitment of a faculty which can conduct 
survey refresher courses and bridge the gaps between theory and practice as 
well as mathematical and non-mathematical approaches may well present some 
unusual difficulties; and adequate financing must be assured. Nevertheless, an 
effective, comprehensive refresher training program definitely seems needed to 
improve the general level of statistical techniques in use, to improve statistical 
understanding, and to win wider membership support for our association. 

Therefore, it is my recommendation that the Board and Council of the 
American Statistical Association (1) request the Section on Training to appoint
a committee of outstanding educators and practitioners in the field of
statistics to explore carefully and determine as clearly as possible the refresher 
training needs and desires of our membership; and (2) upon establishing the 
magnitude of such needs, plan an appropriate continuing refresher training 
program, to begin no later than during the summer of 1960. 


3. STATISTICIANS, DATA INTERPRETATION AND POLICY-MAKING 


Most people will agree that statistics have their greatest importance as an 
aid to policy-making and the decision process. Better information obviously 
has some value per se, but unless it contributes to improvement of some
decision-making process, it really adds little or nothing to the betterment of
mankind. 

Policy decisions invariably concern the future. Hence, if statistics and statis- 
tical techniques are to be useful, they too must point to the future. In my 
judgment, statisticians should have a continuing and aggressive interest in data 
interpretation and in forecasting as an aid to policy-making in whatever capac- 
ity and field they serve since presumably they know more about statistics than 
anyone else. 

All too often statisticians tend to shy away from data interpretation and 
forecasting. Many will engage in elaborate and costly statistical research and 
analysis, but frequently when confronted with the task of drawing significant 
conclusions as well as making recommendations for the future, will shrug their
shoulders or explain with righteous indignation “that’s not my job.” In these 
instances, statisticians in fact are trying to abdicate any responsibility for the 
ultimate decision to be made. From the policy-maker’s point of view, such an 
attitude neither engenders confidence in the statistician’s work nor in the con- 
tribution and stature of the statistical profession. 

In every field we represent, data interpretation, decision-making and fore- 
casting are unavoidable and inseparable. Frankly, there is no escape from fore- 
casting. Therefore, in my judgment, there is no reason for any of us to shy away 
from it, particularly when as statisticians we have so much potentially to offer 
which will help narrow the degree of error in each forecasting and decision- 
making process. 

Many top-level executives have told me this year that the tragedy of statis- 
ticians so frequently is that they will do an enormous amount of work but stop 
short of the point of making a positive contribution to policy. That contribution 
usually would be a helpful interpretation of the data, a suggestion, or a recom- 
mendation for future action based upon the statistician’s findings. Without 
such positive statement, the policy-maker is likely to feel the statistical in- 
vestigation is not worth too much or, as so commonly happens, will select from 
the mass of statistical data presented those facts and figures which will support 
his own preconceived notions. The net result for the statistician and his pro- 
fession is virtually zero; even worse, by refusing to follow his work to the point 
of stating his own positive interpretation of his findings, the statistician allows 
himself to be a tool of the non-statistical policy-maker who can with prudence 
then use the statistical findings to suit whatever purpose he may have in mind. 

Sometimes as statisticians we like to hold “post mortem” discussions about 
our policy-making superiors. Do we ever rationalize that “if they had only 
taken our advice or asked us—they wouldn’t have made such a mistake?” This 
comment presupposes, of course, that we’ve taken time and felt it was our duty 
before the decision was made to tell the boss what we believe was the proper 
interpretation of our statistical findings and what would be a wise policy based 
upon our statistical investigations. There may be, and usually are, many other 
considerations besides our statistics which must be weighed in a given policy 
decision, but we have no right to be critical unless we’ve carried our statistical 
work to the point of its greatest potential usefulness to policy-makers. 

This plea for greater interest in data interpretation and in policy and de- 
cision making is not in any way intended to weaken a statistician’s objectivity, 
but rather to make his work and professional standing more significant. Ad- 
mittedly, some statistical work may seem actually to be far removed from final 
policy decisions, but I’m convinced most statisticians are closely linked to the 
decision making process if they will only take time to seek clearly the end use 
of their efforts. 

It is my strong recommendation (1) that each member of the ASA reappraise 
the contribution which he or she is making to policy in the field being served 
and seek to enlarge that contribution, and (2) that the ASA give greater em- 
phasis in its entire program to the encouragement of statisticians to make this 
a still more dynamic policy-contributing profession.


4. NEED FOR BROAD SUPPORT FOR BETTER STATISTICS


Despite the striking growth in the acceptance and use of statistics in this 
country and all over the world in recent years, enormous gaps in statistical 
knowledge obviously exist. Appalling deficiencies in the quality of existing sta- 
tistical series persist. These statistical shortcomings are evident in both “public 
policy” statistics and what are sometimes called “market or private policy” 
statistics. Public policy statistics refer to those which are needed by govern- 
ment officials to insure fair and equitable development and administration of 
public laws and policies. Private policy statistics cover those which individual 
organizations of all types require to meet both profit and non-profit objectives. 

In many respects, public policy statistics have now passed from the stage of 
public apathy to mild interest. In the business and economic statistics area, 
however, badly needed improvements in “public policy” statistics have been 
repeatedly voted down, all too often in last-minute adverse decisions, by Con- 
gressional joint House-Senate conference committees. As a result, in many im- 
portant legislative and executive administrative fields, far-reaching decisions— 
affecting vital industries and areas and millions of people—are being made on 
the basis of admittedly weak and misleading statistics. Why this highly un- 
desirable situation should continue is not easily explained. At the root of the 
problem lies the lack of appreciation by many people, including some in high 
government positions, of the importance of having more satisfactory statistical 
information on hand to guide key policy and administrative decisions. Closely 
allied is the failure of most people even directly affected by government policy 
decisions to appreciate the basic weakness in the statistical information used 
in formulating and executing such policies. 

In a great many instances, at least as much progress has been made in recent 
years in improving private policy statistics as public policy statistics. Private 
organizations which have grasped the importance of more and better statistical 
information for proper decision making are now spending substantial and 
growing sums for statistical and allied investigations. In more and more companies,
for example, a major decision won’t be made without a careful survey
of all relevant trends and developments plus the preparation of detailed fore- 
casts of future developments. 

Statistical expense budgets are rising among many private organizations, 
but aggregate outlays are still quite small and coverage is spotty. Moreover, 
those who insist on having better information to guide their own businesses or 
other activities typically show little or no concern about the need to improve 
public policy statistics which can and do have a profound effect upon their own 
operations. 

It seems obvious that there is a large educational “selling” job to be done 
to obtain far greater grass roots support for better statistical programs. A great 
deal of preliminary work leading to this objective has been done, but it is now 
time for the statistical profession to go on the offensive. Selling is not usually 
considered to be a statistician’s function. I am convinced, nevertheless, that 
our profession must now sharpen its selling skills. This association specifically 
must intensify its efforts to demonstrate convincingly the enormous contribution
of statistics to modern life, and to publicize widely and in readily understandable
language the opportunities for appropriately trained individuals to
have interesting and profitable careers as statisticians. As appreciation of the 
need for better statistics grows, interest in and support for statistics by both 
public and private organizations inevitably will increase. 

It is my recommendation that the American Statistical Association (1) create 
a Public Relations committee to develop an educational program to inform the 
public of the growing helpful role of statistics and statisticians in all walks of 
life, and (2) reconsider its long-standing policy against making public com- 
ments on the quality or quantity of available statistics in any field and take a 
more positive stand for more and better statistics, especially in the “public 
policy” area. 


5. SPLINTERING OF STATISTICIANS INTO SPECIALIZED ORGANIZATIONS 

With the recent noteworthy growth of the statistics profession, it is not surprising 
that specialized interests have begun to assert themselves. At present 
our association has five sections: biometrics (formed in 1941); training (1944); 
business and economics statistics (1950); social statistics (1953); and physical 
and engineering sciences (1954). Our latest directory shows over 90 per cent of 
our entire membership has expressed an interest in one or more of these sections. 
Specialized interests can be seen in the percentages of total membership giving 
preference to individual sections: training 18 per cent; biometrics 22 per cent; 
physical and engineering sciences 25 per cent; social statistics 30 per cent; and 
business and economics 62 per cent. 

These sections have proved to be highly useful and effective in stimulating 
greater interest among our members, in planning annual meeting programs, 
and in contributing to our publications. Requests for still other sections to be 
chartered within the ASA are to be expected and in my judgment should be 
approved whenever sufficient need can be demonstrated. 

As important as these sections are, however, their very existence reflects a 
number of serious problems for our association and the statistical profession. 
First, there is a marked tendency to de-emphasize the common interests among 
statisticians by stressing their specialized interests; second, the natural cross- 
fertilization processes within the profession are reduced; third, the costs of our 
association as well as allied groups are increased, particularly as requests for 
more and more specialized publications are granted; and fourth, there is an 
ever-present threat that specialized groups will withdraw from the parent 
association. 

This splintering problem is by no means unique within the statistics pro- 
fession. Actually, I find that it is common to virtually all professional groups 
and organizations. I’ve had several occasions this past year to discuss splinter- 
ing problems informally with top officers of other statistical and related associa- 
tions. Frankly, many are deeply concerned. They see a tremendous prolifera- 
tion of professional statistical organizations—each dedicated to some important 
but nonetheless limited objective, yet all competing to some extent with each 
other. They see the need for more specialized programs as their organizations 
grow, but also observe a resultant sharp rise in operating costs. They note some 
lessening of general interest in the parent association, yet often a growing de- 
pendence upon the parent organization for guidance and financial support. 

I am sure that much good could be gained for all concerned if top-level 
inter-association contacts and discussions could be pursued officially and regularly. 
This year an important first step has been taken by the American Statistical 
Association and the American Society for Quality Control in the joint sponsor- 
ship of a new statistical publication in the physical and engineering field. Here 
at least is one illustration of a successful effort to bring statisticians closer to- 
gether, and incidentally to save money over the cost of separate publications 
by both organizations. 

To date the American Statistical Association, in my judgment, has been 
strengthened rather than weakened by the chartering of its five sections. It is 
apparent, however, that the splintering trend now merits top-level study and 
attention. 

Therefore, it is my recommendation that (1) the chairmen of the five ASA 
sections be requested to meet regularly with the Board of Directors of the 
association to assure closer coordination of national and sectional interests and 
activities; (2) the program committee for each annual meeting seek meaningful 
co-sponsorship of a greater number of sessions by two or more sections to 
further understanding and cooperation between sections; and (3) the President 
of the association—pursuant to the authority just granted him by the Board 
and Council—undertake to establish close and continuing relations with the 
chief executive officers of other statistical and related associations to the end 
of fostering better understanding among statisticians and a greater interchange 
of ideas on problems of mutual interest. 


SUMMARY AND CONCLUSIONS 


I fully appreciate that each of the five major problems which has been dis- 
cussed as an actual or potential roadblock to further vigorous growth in the 
statistics field is not going to be solved readily nor as the result of mere sug- 
gestions made in a Presidential address. The answer lies chiefly in a fuller ap- 
preciation of the scope and significance of these problems by statisticians 
generally as well as by those charged with official duties within our association, 
plus a determination by all to take needed action! 

Let me make clear that it will require a good deal of time, effort, planning, 
cooperation, patience, perseverance, and money: (1) to bridge the mathematical 
cleavage existing in our profession; (2) to develop an effective refresher 
training plan for practicing statisticians; (3) to convince statisticians generally 
that they have a responsibility to their profession as well as themselves to 
pursue their work to the point of more positive aid to policy-makers; (4) to go 
on the offensive to win broad support for better statistics in public and private 
life; and (5) to insure that splintering of specialized interests does not under- 
mine the strength of our profession and association. But, I have no doubt that 
the human and financial resources will be found to meet these problems under 
the capable leadership and loyal membership which this association will have in 
coming years. 

As statisticians we all have reason to be proud today—the record of growth 
and accomplishment is impressive. With the enthusiasm and determination now 
evident within this profession and this association, there is strong reason to 
face the future with confidence. For, in my judgment, it is very clear today that 
the statisticians of tomorrow will have a still greater role in shaping destiny. 





SOME SOVIET STATISTICAL BOOKS OF 1957* 


EBERHARD M. FELS 
University of Pittsburgh 


This is a description of the contents and evaluation of Mathematics 
for Economists by Aron Boyarski, several contributions to the Fest- 
schrift in honor of Stanislav Strumilin, Statistical Methods for the Study 
of the Economy by Timon Ryabushkin, and Probability and Information 
by Akiva and Isaak Yaglom, with some additional references to Soviet 
and non-Soviet technical literature. 


If a complete inventory and evaluation of Soviet statistical writings in 1957 
had been intended, the writer could not have achieved it for lack of technical 
competence, available time, and library facilities (relatively good as they 
were). Nor are the selected books thematically from the mainstream of statis- 
tics. The first is a textbook of mathematics for economists with only secondary 
interest in statistical techniques, although with a curious evaluation of their 
applicability and worth. The second is a voluminous collection of essays ranging 
from esoteric Marxiana and economic historiography to index-number 
theory and what some would call operations research. The third is a statistics 
book, but one that is preoccupied with topics which are peripheral to modern 
inferential statistics. The fourth and last is about probability and information 
theory. 

Excluded from consideration was the obviously propagandistic, the repeti- 
tive, and the dull. Writings had to make an interesting point, or develop a 
hitherto neglected theme, or allow glances behind little watched scenes to 
qualify for inclusion. It is felt that, by common scholastic and literary stand- 
ards, the chosen items are well above average. Thus, it is questionable whether 
four—or forty—more books and articles would have changed the general pic- 
ture much, impressionistic as it is. Also, the very shortcomings of the following 
pages may draw attention to the desirability of systematically keeping tab on 
Soviet scientific endeavors, uniformly exciting or not. 


1. BOYARSKI’S “MATHEMATICS FOR ECONOMISTS” 


Mathematics for Economists by Aron Yakovlevich Boyarski¹ is probably the 
first Soviet book to resemble the books of similar title by Allen² and Tintner,³ 
and the older ones by Bowley⁴ and Evans,⁵ and it pursues a similar purpose: 
to persuade the non-mathematical economist that he may become a better 
economist if he picks up some mathematical equipment. In coverage it comes 





* An invited review article. Part of the work on this article was done while the author was a member of the 
faculty of the University of California (Berkeley). 

1 Aron Yakovlevich Boyarski, Matematika dlya ekonomistov (elementy analiza beskonechno malykh, teoriya 
veroyatnostei i matematicheskoi statistiki), Moskva: Gosudarstvennoe statisticheskoe izdatel’stvo, 1957, 367 pp., 
8 r. 25 k. 

2 R. G. D. Allen, Mathematical Analysis for Economists, London: Macmillan, 1938. 

3 Gerhard Tintner, Mathematics and Statistics for Economists, New York: Rinehart, 1953. 

4 Arthur L. Bowley, The Mathematical Groundwork of Economics, Oxford: Clarendon Press, 1924. 

5 Griffith C. Evans, Mathematical Introduction to Economics, New York: McGraw-Hill, 1930. 
closest to Tintner’s text in that it includes some statistics, although, as it 
turns out, with a peculiarly negative attitude. 

At first the author briefly surveys the role of mathematics in economic anal- 
ysis, beginning with a definition of mathematics: “Mathematics is the science 
of quantitative relations and spatial forms of the real world” (p. 7). Boyarski 
repeatedly emphasizes that, “no matter how deep the intrusion into the realm 
of abstraction,” mathematics is by no means the creation of consciousness but 
rather recognizes completely objective realities. Throughout the book there is 
considerable philosophical ambition, most of it of doubtful relevance to the 
expository purpose; above all Boyarski seems to fear one might get the impres- 
sion that the application of mathematics to economic phenomena inevitably 
traps one in idealism.⁶ Paralleling the discussion that has taken place in the 
Soviet Union on the issue of “classical” vs. “dialectical” logic,⁷ Boyarski struggles 
with the concepts of constant and variable magnitudes with unconvincing 
generalities, since there is no understanding of the notational foundation of the 
distinction. He then distinguishes two kinds of variable magnitudes, discrete 
and continuous. Apparently the intention is to draw the distinction between 
rational and real numbers. The explanation goes, however, in terms of dense- 
ness, and it is overlooked that the rationals are already dense. He then explains 
the concepts of function and graph and coordinate systems. 
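
The remark that the rationals are already dense can be checked with a small sketch. The helper name `between` and the starting values are illustrative assumptions, not anything taken from Boyarski's text: between any two distinct rationals lies another rational, so denseness alone cannot separate the rationals from the reals.

```python
from fractions import Fraction

def between(a: Fraction, b: Fraction) -> Fraction:
    """Return a rational strictly between a and b (hypothetical helper: the midpoint)."""
    return (a + b) / 2

# Illustrative endpoints; any two distinct rationals would do.
a, b = Fraction(1, 3), Fraction(1, 2)
for _ in range(5):
    m = between(a, b)
    assert a < m < b   # the midpoint is again a rational inside the interval
    b = m              # shrink the interval and repeat
print(a, b)
```

The loop never runs out of rationals however often the interval is halved, which is the sense in which a criterion based on denseness fails to draw the intended line.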

The next chapter is devoted to a rather clear and leisurely exposition of the 
following topics: linear functions, some elementary analytic geometry up to 
hyperbolas, arithmetic and harmonic means, geometric progressions (where the 
illustrations are taken from interest-and-capitalization computations), logarith- 
mic scales, polynomials, trigonometric functions, polar coordinates, increments 
and rates of change of functions, differences of various order, linear interpola- 
tion, and, in a loosely intuitive terminology, the concept of limit. When it 
comes to the notion of “infinitely small,” he contents himself with saying that 
“Engels wrote ... with complete justification that molecules etc. may serve 
as the real prototype for the mathematical infinitely small” (p. 84). He is 
aware of the strict falsehood of the natura-non-facit-saltum maxim, and for a 
moment he seems to sense the implications of his previous ontological commit- 
ment. But then, quickly switching to a pragmatic argument, he says we often 
obtain practically correct results that apply to a finite world by using calculus 
as if matter were continuous (p. 84). On the other hand, several orders of small- 
ness are introduced as a matter of course, without philosophical afterthoughts. 





6 Apparently, leading Soviet workers in the most abstract branches of mathematics are remarkably immune 
from the fear of being branded as idealists and the like. Nikolai N. Luzin, Sobranie Sochinenii, Tom II: Deskriptivnaya 
Teoriya Mnozhestv (Descriptive Set Theory), Moskva: Izdatel’stvo Akademii Nauk SSSR, 1958 (see especially 
pp. 23-36, 267-269, 464-469, 509-519, 533-536) serenely examines the set-theoretic and foundations-of-mathematics 
tenets of Borel, Lebesgue, Cantor, Hadamard, Baire, Hilbert, Zermelo, Brouwer, Weyl, Sierpiński, Zhegalkin, and 
others. While advocating his own views, he feels it “... is a question of personal conviction or taste ...” (p. 29) 
which point of departure one prefers. In a prefatory note, P. S. Novikov and L. V. Keldysh plausibly argue that 
Luzin anticipated some of Kurt Gödel’s ideas of the early ’thirties. Since the latter is usually considered philosophically 
rather a Platonist, such references and comparisons cannot, strictly speaking, be ideology-free. But that 
does not appear to worry the pure mathematicians. 

The view that Luzin is, in a specific sense, a forerunner of Gödel has quite recently been expressed too by 
Wacław Sierpiński, Cardinal and Ordinal Numbers, Warszawa: Państwowe Wydawnictwo Naukowe, 1958, p. 95. 

7 For excellent detailed reviews of Soviet logical writings see George L. Kline in the Journal of Symbolic Logic, 
Vol. 16 (1951), pp. 46-48; Vol. 17 (1952), pp. 124-129; Vol. 18 (1953), pp. 83-86, 271-272; Vol. 19 (1954), p. 149. 
Further chapters deal with derivatives. The simplest formulas of differential 
calculus are given, up to and including logarithmic derivation, but on a lower 
level of sophistication than in, say, Allen’s book.⁸ There are, however, a few 
pages on the derivatives of trigonometric functions. 

Now when a function has an economic interpretation in which the adjective 
“total” occurs (total revenue, total cost, etc.), then the first derivative of that 
function, if it exists, is usually called “marginal” this or that (marginal revenue, 
marginal cost, etc.). Indeed, this explains why marginal analysis has long been 
almost synonymous with economic analysis simple and pure; well-worn differ- 
ential calculus was so readily adaptable, no matter how technical or untechni- 
cal the translation. The parallelism of “marginal” and “first derivative” notions 
has been elaborated by no less a figure than Fréchet.⁹ Therefore it is hard indeed 
to teach differential calculus to economists but avoid illustrations that are 
somehow marginalist in nature. It is hard but Boyarski never quits trying— 
probably to avoid being accused of expounding something which historically, 
if not opposed by, was not directly associated with Marxism-Leninism.¹⁰ This 
prissiness is one of the main features of the book—and a little comic since there 
is some evidence that it is not really necessary any more.¹¹ 

There are chapters on second and higher derivatives and their interpreta- 
tions. A rapid survey of Taylor and Maclaurin series is included, too. One of 
the better illustrations is in the chapter on minima and maxima: route selection 
in transportation with a view to cost minimization (p. 134). When he comes to 
functions of several variables and partial derivatives, Boyarski finds illustrative 
examples in Marx (p. 144). 

After that the simplest aspects of the method of least squares are explained. 
No mention is made of standard errors of estimate and of standard errors of 
regression coefficients. The “random variation of attributes” and correlation 
are the next topics. On page 178, Boyarski comes close to the original idea of 
Elmer J. Working’s classical paper,¹² which in a sense started the literature on 
the identifiability-of-structural-parameters theme in econometrics. The (Bravais-Pearson) 
coefficient of correlation is introduced as the square root of the 
ratio of the variance of the residuals to the variance of the regressand variable. 

The more mathematical discussion is resumed with a chapter on the integral 
calculus, with the simplest formulas including the trapezoidal rule for approxi- 
mate integration. Remarkably apt is the way in which Boyarski points out the 





8 Cf. note 2, pp. 246 seq. 

9 Maurice Fréchet, “Dégager les possibilités et les limites de l’application des sciences mathématiques (et en 
particulier du calcul des probabilités) à l’étude des phénomènes économiques et sociaux,” Revue de l’institut 
international de statistique, La Haye, Vol. 14, Liv. 1/4 (1946) pp. 16-30. 

10 It is of course the opinion of many economists that, were Marx alive today, he would consider marginal 
analysis ideologically neutral and use it without hesitation. In this respect, see also the article by A. A. Konyus in 
the Strumilin Festschrift mentioned below. 

11 In the Promyshlenno-ekonomicheskaya Gazeta of March 19, 1958, p. 2, one A. Tolkachev discusses the proposed 
construction of input-output tables for a model of the whole economy with the help of high-speed electronic 
computers. Indeed, the economic institute of the Gosplan of the USSR appears to have started work in this direction 
in 1957. Also in Pravda of June 11, 1957, p. 2, V. Trapeznikov, corresponding member of the Academy of Sciences of 
the USSR and director of the institute for automation and telemechanics, writes: “... we must by all means 
strengthen our work in the field of econometrics. This will be of much help to our planning authorities under the 
new conditions of industrial administration.” For Boyarski in the book under discussion “econometrics” is still a 
dirty word. 

12 Elmer J. Working, “What Do Statistical ‘Demand Curves’ Show?” Quarterly Journal of Economics, Vol. XLI 
(February, 1927) pp. 212-35. 
connection between integral and mean (p. 221), which almost echoes Menger’s¹³ 
approach. Simpson’s rule is also given. In three pages a glimpse beyond, on 
differential equations, is afforded. 

With Chap. XII Boyarski switches back to statistics for the rest of the book. 
There is a discussion of the concept of density, of the relation between fre- 
quency and integral, of mode, median, mean deviation and standard deviation, 
and of the transformation of distributions. 

The chapter on the elements of probability theory (XIII) looks better than 
the rest; but then there already exist several excellent Soviet semi-popular 
expositions.¹⁴ The reader is introduced to the essentials of the binomial distribution, 
the limit theorem of de Moivre-Laplace,¹⁵ the concept of mathematical 
expectation, Markov’s lemma (p. 293), which, in a slightly different form, we 
know as Chebyshev’s inequality,¹⁶ Chebyshev’s theorem (p. 294), which we 
usually call the weak law of large numbers (in a classical form),¹⁷ Bernoulli’s 
theorem (p. 297, another weak form of the law of large numbers¹⁸), Lyapunov’s 
theorem (p. 298), a form of the central limit theorem,¹⁹ Poisson, geometrical, 
and exponential distributions, and a few pages on the distribution of sums of 
random elements. 
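
For orientation, the link between the first two of these results can be written out in modern textbook notation. This is a sketch under the assumption that Boyarski's "Markov's lemma" corresponds to the usual Markov inequality; the review does not reproduce his exact statement, and the symbols X, a, μ, σ², and ε are introduced here for illustration only.

```latex
% Markov's inequality, for a nonnegative random variable X and any a > 0:
P(X \ge a) \;\le\; \frac{E[X]}{a}

% Applied to the nonnegative variable (X - \mu)^2 with a = \varepsilon^2,
% where \mu = E[X] and \sigma^2 = \operatorname{Var}(X), it yields Chebyshev's inequality:
P\bigl(|X - \mu| \ge \varepsilon\bigr)
  \;=\; P\bigl((X - \mu)^2 \ge \varepsilon^2\bigr)
  \;\le\; \frac{\sigma^2}{\varepsilon^2}
```

Read this way, the two statements differ only in the variable to which the bound is applied, which is consistent with the description "in a slightly different form."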

A chapter on random sampling (XIV) deals with the following topics: dis- 
tribution of sample mean and sample variance, a few remarks on stratified 
sampling and some on multi-stage sampling, the determination of the proper 
sample size with given criteria, and Student’s t. This is all right as far as it goes, 
and Boyarski appears to sympathize with what he teaches. 

This impression changes in Chap. XV on the testing of hypotheses. There is 
precious little substance: the main example given concerns the slippage of 
means. But Boyarski has grave doubts about the whole methodology (essen- 
tially Neyman’s and Pearson’s, who are never named). The main objection is 
that “accepted” simply means, or should mean, “not rejected,” given a level of 
significance.²⁰ Boyarski simply cannot help missing a criterion of “absolute significance” 
(p. 345). The chapter does, however, explain a few simple chi-square 
and contingency-table techniques. 

The epilogue is used by Boyarski for pointing out the danger that in mathe- 
matical economics “matter vanishes, only equations remain” (p. 361). After a 
short, vague diatribe against idealist mathematicians, he engages in an indictment 
of the philosophy of Ernst Mach,²¹ about which he says: “Unfortunately, 





13 Karl Menger, Calculus: A Modern Approach, Boston: Ginn and Co., 1955, e.g., p. 40. 

14 The book by the Yagloms discussed below also falls in this category. 

15 Cf., e.g., William Feller, An Introduction to Probability Theory and Its Applications, Vol. 1, Second Edition, 
New York: John Wiley & Sons, Inc., 1957, p. 172. 

16 Cf. Feller, op. cit., p. 219. But see also Michel Loève, Probability Theory, New York: D. Van Nostrand Company, 
Inc., 1955, p. 275. 

17 Cf. Feller, op. cit., pp. 141, 228. 

18 Cf. Feller, op. cit., p. 141. 

19 Cf. Feller, op. cit., p. 229. 

20 Actually, it has often been felt by “idealist” philosophers that there is something unpleasant about the circumstance 
that the symmetry of “true” and “false” is not paralleled by “to be accepted” and “to be rejected” in 
the sense of “verified” and “falsified” or “known to be true” and “known to be false.” Standard references to the 
discussion of these matters are: Alfred J. Ayer, The Foundations of Empirical Knowledge, London: Macmillan, 1953; 
Karl R. Popper, Logik der Forschung, Vienna: J. Springer, 1935, with emphasis on the asymmetry. 

21 The relevance of this can only be understood if one knows that Mach happened to be Lenin’s pet foe in the 
philosophy of science. Why this should have been so is again only intelligible if one knows that Mach’s teachings 
were regarded with favor by exponents of revisionism, which, Lenin felt, might corrode the Marxist foundation of 
the whole political cause he stood for. 
this philosophical school made mathematical statistics its preferred field of 
activity. There is nothing astonishing in that: as soon as there are no (recog- 
nized) objective connections in the phenomena but only the ‘routine of our 
experience,’ usually established by frequent repetition of series of perceptions, 
the whole task of science leads to the counting of these repetitions, i.e. to their 
statistics. The disastrous effects of the Machist school of Pearson on mathe- 
matical statistics may well be seen by the example of that application which 
the exponents of this school make of various ‘criteria.’ We saw above the whole 
poverty of possibilities which these empirical (or empiricist) criteria opened 
up” (p. 361). “And then it is easy to build this ‘construct’ in such a fashion 
that it looks favorable to the bourgeois apologists” (p. 362). Therefore, Boyarski 
concludes, Soviet science has the task of “cleaning the powerful instrument of 
mathematical science from the endeavors of apologetic applications, and it 
must be freed from ... elements of idealist philosophy” (p. 363). “One must 
keep in mind that, while the propositions of mathematical science by themselves 
are not partisan, its utilization in the field of economics is permeated by 
partisanship” (ibid.). 


2. THE STRUMILIN Festschrift 


Let us take up a book of a completely different nature and indeed an unusually 
intriguing and impressively substantial publication: the volume of 
essays²² in honor of the 80th birthday of academician Stanislav Gustavovich 
Strumilin, who has aptly been called the enfant terrible of Soviet economics, 
economic history, and statistics. By sheer volume of output,²³ scholarliness, 
versatility, and boldness (occasionally thinking nothing of criticizing Marx 
himself²³ᵃ) he has no peer, and writing in his honor appears to have quickened 
some minds and lent wings to some pens. There are 29 articles in the volume 
with a range of topics that clearly transcends the scope of this journal. It is very 
likely that the contributions to economic history in this volume are the most 
solid and scholarly ones. Almost all papers use statistical data and statistical 
methods of presentation extensively. There is, on the other hand, comparatively 
little of what would pass as statistical analysis proper. We shall therefore single 
out for discussion only those essays that pertain to statistics or econometrics 
and make new and interesting theoretical points.²⁴ 





The most recent authoritative summary of the official Soviet position on pertinent philosophical problems can 
be found in Osnovy Marksistskoi Filosofii, edited for the Institute of Philosophy of the Academy of Sciences of the 
USSR by F. V. Konstantinov, Moskva: Gospolitizdat, 1958, 688 pp., which, however, does not contain anything 
essential that has not been propagated for decades. 

22 V. S. Nemchinov, B. B. Kafengauz, L. E. Mints, A. E. Probst, T. V. Ryabushkin, P. A. Khromov (eds.), 
Voprosy ekonomiki, planirovaniya i statistiki, sbornik statei k vos’midesyatiletiyu akademika Stanislava Gustavovicha 
Strumilina, Moskva: Izdatel’stvo Akademii Nauk SSSR, 1957, 385 pp., 25 r. 

23 The list of Strumilin’s publications contains well over 400 titles, including those which he edited. 

23a Nor does he stop there. See Harry Schwartz, “Economist Splits With Khrushchev: Strumilin, Soviet Dean in 
the Field, Challenges 7-Year Plan’s Key Features,” The New York Times, January 4, 1959, p. 22. 

24 To give an idea of the range of topics, we give the minimal information on those papers which will not be 
discussed otherwise. G. M. Krzhizhanovski and V. S. Nemchinov discuss aspects of Strumilin’s life. Then follows 
the list of Strumilin’s publications. G. M. Sorokin writes on the essence and methods of economic planning. V. S. 
Novikov presents a more programmatic paper on the problem of national accounting. A. I. Pashkov has a very 
“Strumilinesque” piece on the source of differential rent and its distribution under socialism, in which he takes issue 
with other Soviet economists’ interpretation of certain positions of Lenin in such matters, and he insists that what 
was correct 40 years ago may not be correct any more. T. V. Ryabushkin pleads for such economico-statistical in- 
dicators as might facilitate interregional and international comparisons; he then gradually changes the topic and 
The first piece to merit attention is entitled “Cost and Value” (pp. 72-82) 
and written by the same Boyarski whose textbook we dealt with above. The 
paper starts with the declared purpose of lending (additional) support to the 
Marxist labor theory of value, but while this purpose is somehow lost sight of 
and plainly not achieved—Boyarski deserves praise for repeatedly pointing out 
that he is begging the question a little—something quite different is note- 
worthy and, as far as Soviet economic writing is concerned, definitely novel: 
Boyarski uses a simple linear three-equation model for the iterative cost-and- 
price determination within a coal-steel-and-electric-power combine (but the 
model easily admits of generalizations). Is it a belated discovery or Sovietization 
of a Walras-Goodwin type “tâtonnement,”²⁵ the iterative, step-by-step 
groping for the “right” price formation? Or is it a subtle argument for the desirability 
or at least harmlessness of more decentralization? In any event we are 
put in the realm of “linear economics.” Boyarski’s acknowledgments are inter- 
esting in themselves. He gives credit to an oral presentation of Ya. Shatunovski 
in 1926 (who was a noted algebraist²⁵ᵃ), brought up again by one Liberman in 
1956. Boyarski professes, however, to have been influenced mainly by Strumilin’s 
most recent work. There is no reference to the work of Enrico Barone in 
1908, Vilfredo Pareto in 1909, H. D. Dickinson in 1933, Oscar Lange in 1938, 
Maurice H. Dobb in 1940, Abba P. Lerner in 1944, and others whom some 
people would have cited. But much more strangely, there is no reference to 
Soviet mathematician L. V. Kantorovich, whose relevant work and importance 
have recently been discovered in the West and who in 1939 appears to have 
invented linear programming,²⁶ later applying his techniques especially to transportation 
problems. 

Boyarski thinks of a combine whose mutually dependent cost structure hypothetically 
looks like this: 





points out real or apparent taxonomic shortcomings in Western statistics concerning the self-employed. I. P. Bardin 
writes about the influence of technological progress on planning and productive capacity in metallurgical plants, 
mainly in regard to Strumilin’s book on the history of metallurgy in the SSSR. G. D. Bakulev writes on the mecha- 
nization of production and the productivity of labor in the Soviet coal industry, with a plea for better, statistically 
objective, criteria for the economic efficiency of the automation of production, and for a wider publication of do- 
mestic and foreign experiences with automation problems. M. M. Rabinovich discusses aspects of the engineering 
economics of coke. L. I. Ulitski writes on the history of coking abroad. S. V. Slavin analyses the transportation 
development in the Soviet North. N. M. Druzhinin gives a detailed documentation of the agrarian conditions in 
the central Black Sea region toward the middle of the 19th century. B. B. Kafengauz deals with the problem of 
original capital accumulation in Russia (pointing out an interesting lack of agreement in this matter among economic 
historians). A. E. Probst writes on the wood and fuel policy of Peter the Great. V. K. Yatsunski analyzes the tech- 
nological revolution in the Russian paper industry, 1830-1850. A. L. Tsukernik writes about the salt industry of the 
Don basin in the 18th century. A. D. Gusakov is concerned with usury in the Kiev Rus’. V. S. Martynovskaya discusses 
V. G. Belinski’s views on the social and economic development of Russia. P. V. Khromov deals with pre-capitalistic 
rent in Russia. N. I. Pavlenko contributes detailed statistical material concerning the number of laborers 
in Russian metallurgy around 1800. F. D. Markuson surveys the development of world population from 1900 to 
1950, but there are also illuminating figures about the population of various continents during earlier centuries, 
and the author also quotes many a Western demographer. B. Ts. Urlanis’ topic is “Demography and the Lengthening 
of Life,” also with free, and properly acknowledged, use of non-Soviet sources. (Urlanis says that the average length 
of life in pre-revolutionary Russia was 32 years; in 1927, 44 years; now, 64 years.) M. V. Ptukha analyzes some 
results of the census taken in the Ukraine during the second five-year plan. A. G. Rashin describes the movement 
of wages for workers and employees in the Russian railroad service, 1884-1913. 

25 For background and history, see Don Patinkin, Money, Interest, and Prices: An Integration of Monetary and 
Value Theory, Evanston, Illinois: Row, Peterson and Company, 1956, esp. pp. 377-85. 

25a The reference is of dubious accuracy. The noted algebraist (mentioned also by A. G. Kurosh, Kurs vysshei 
algebry, Moskva: Gostekhizdat, 1956, p. 15) is S. O. Shatunovski (1859-1929). 

26 This was pointed out to me by Gregory Grossman and Robert W. Campbell. The latter is the translator of 
Kantorovich’s relevant work. 
This can be translated into three cost functions:

$$
\begin{aligned}
\text{steel cost (in chosen units)} &= p = 2u + e + 320,\\
\text{coal cost (in chosen units)} &= u = .06p + .1e + 48,\\
\text{power cost (in chosen units)} &= e = .05u + 16.
\end{aligned}
$$


Now, says Boyarski, if the costs were centrally determined and fixed (each
member of the combine transferring its product at cost), the equilibrium solution
to be computed would be: p = 500, u = 80, e = 20.
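
The centralized computation can be checked by solving the three cost functions as a linear system. The following is a minimal sketch, not Boyarski's own procedure: the helper `solve` and the use of exact fractions are choices made here for illustration.

```python
from fractions import Fraction as F

def solve(a, b):
    """Solve the square linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Find a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        # Eliminate the pivot column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [row[n] for row in m]

# Unknowns ordered (p, u, e); each cost function is moved to the form a x = b.
a = [
    [F(1), F(-2), F(-1)],            # p - 2u - e = 320
    [F(-6, 100), F(1), F(-1, 10)],   # -.06p + u - .1e = 48
    [F(0), F(-5, 100), F(1)],        # -.05u + e = 16
]
b = [F(320), F(48), F(16)]

p, u, e = solve(a, b)
print(p, u, e)  # 500 80 20
```

Exact rational arithmetic is used only so that the printed solution is free of rounding noise; floating point would serve equally well for a system of this size.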

But suppose this centralized computation was not performed. Instead let 
each member of the combine announce its cost par hasard, out of the blue, as 
it were. Let, for instance, the steel plant announce that the steel cost (and price) 
is 200. Likewise, let the initial coal cost be 120 and the initial power cost 100. 


Then the three first-stage cost functions are: 
$$
\begin{aligned}
p_1 &= (2)(120) + (100) + 320 = 660,\\
u_1 &= (.06)(200) + (.1)(100) + 48 = 70,\\
e_1 &= (.05)(120) + 16 = 22.
\end{aligned}
$$


Suppose these newly computed costs are communicated at once as the new 
prices. Then the second-stage cost functions are: 


$$
\begin{aligned}
p_2 &= (2)(70) + (1)(22) + 320 = 482,\\
u_2 &= (.06)(660) + (.1)(22) + 48 = 89.8,\\
e_2 &= (.05)(70) + 16 = 19.5.
\end{aligned}
$$


It is at this stage that Boyarski introduces a new idea. He wants to show, not by 
rigorous proof but by illustrative example, that even the wildest initial prices, 
by iterative cost determination, have a way of converging toward the equilibrium
prices.²⁷ Boyarski says that inasmuch as the steel mill and the power plant
see that their products now have become cheaper (according to this calculation) 
“they will not be in a hurry to communicate their new prices” (p. 74)! The coal 
pit, however, announces its new price. Then: 





27 That such convergence need not, in general, take place, is of course known since, say, Abraham Wald, “On 
Some Systems of Equations of Mathematical Economics,” Econometrica, Vol. 19 (1951), pp. 384-385; for more 
background see sources given by Patinkin, op. cit. (footnote 25), p. 383. 
$$
\begin{aligned}
p_3 &= (2)(89.8) + (1)(22) + 320 = 521.6,\\
u_3 &= u_2,\\
e_3 &= (.05)(89.8) + 16 = 20.5.
\end{aligned}
$$


By now the cost information has leaked through to the coal pit; also, the power 
plant is assumed to announce its new higher price. Therefore: 


$$
\begin{aligned}
p_4 &= (2)(89.8) + (1)(20.5) + 320 = 520.1,\\
u_4 &= (.06)(521.6) + (.1)(20.5) + 48 = 81.3,\\
e_4 &= (.05)(89.8) + 16 = 20.5,
\end{aligned}
$$


and after a few more rounds, Boyarski says, we come pretty close to the 
equilibrium costs-and-prices, even if the appropriate information is temporarily 
withheld by the members in pursuit of their interests.
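
The iterative determination of costs can be sketched in a few lines. The sketch below covers only the simple case in which every member announces its newly computed cost in every round; the selective withholding of the third and fourth stages is not modelled. The function name `costs` and the number of rounds are assumptions made for illustration.

```python
def costs(p, u, e):
    """One round of cost computation from the currently announced prices."""
    return (2 * u + e + 320,            # steel
            0.06 * p + 0.1 * e + 48,    # coal
            0.05 * u + 16)              # power

# Initial prices announced "out of the blue", as in the text.
announced = (200.0, 120.0, 100.0)
history = [announced]

# Simultaneous announcement: each member publishes its new cost every round.
for _ in range(60):
    announced = costs(*announced)
    history.append(announced)

print(history[1])                        # first-stage costs
print(history[2])                        # second-stage costs
print([round(x, 3) for x in announced])  # values after 60 rounds
```

For these coefficients the announced prices settle at the centrally computed equilibrium; with other coefficients the same loop may fail to converge, which is the point of the reservation about convergence in general.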

The incorporation of informational policies for the members is intriguing. 
Actually, in Boyarski’s world the transition from the first to the second stage is 
already dubious. Further, it is of course only true under particular demand con- 
ditions that it is always favorable to try to stick to a higher asked price for 
one’s product. But such intricacies, which would soon lead to a game model 
with curious signaling strategies,²⁸ are not envisaged by Boyarski. He sees,
however, that the assumption of constant technological coefficients may be 
overstrained here. Above all, that other costs remain the same throughout the 
game, especially wages, does not satisfy him, either. He says that the regulation 
of wages, especially with a view to heterogeneous skills, remains one of the most 
important tasks, and only if one could already take for granted that wages 
mirror effort, would one be justified in claiming for values what the model allows 
one to claim for prices.²⁹ Boyarski also investigates what happens (assuming
that the system does converge) when profit is sought by a mark-up, say, as 
some percentage of total cost. Suppose prices, determined jointly or by itera- 
tive process, were proportional to amounts of effort spent on the production of 
the respective products. Then the percentage-of-total-cost mark-up destroys 
the proportionality. (If profit were a percentage of the wage bill, the proportionality
would be preserved.) In other words, it turns out that the full-cost principle
does not lead to prices which are in accordance with labor values. But
that should not bother an adherent of the labor theory of value; the rationality
of the full-cost principle for profit-maximization is very restricted anyway.³⁰
Toward the end of his paper Boyarski discusses some of the necessary and 
sufficient conditions for the convergence of his prices, but by abstracting from 
informational policies—his best idea. Also, he might have discussed the very 





28 For a stimulating discussion of such aspects and further literature see R. Duncan Luce and Howard Raiffa, 
Games and Decisions: Introduction and Critical Survey, New York: John Wiley & Sons, Inc., 1957, esp. pp. 161-2. 

29 Concerning the comparability of heterogeneous skills and efforts, Boyarski is remarkably cautious and more
agnostic than, say, the liberal conservative Maurice Allais, Traité d’économie pure, Paris: Imprimerie nationale,
1952, with his notions of “laborie” etc.

30 For a fresh point of view see Karl Christian Kuhlo, “Die Qualität als Instrumentalvariable beim Vollkostenprinzip,”
Ifo-Studien, Vol. 2 (1956) pp. 221-238.
relevant speed of convergence.³¹ Still, Boyarski’s paper could well be a new
point of departure.
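
The remark on mark-ups can be illustrated with the coefficients of the steel, coal, and power example. The sketch rests on an assumption that is not Boyarski's: the constant terms (320, 48, 16) are read as wage bills, so that the prices without mark-up stand in for labor values. The function `equilibrium` and the 10 per cent rates are likewise hypothetical.

```python
def equilibrium(markup_on_total=0.0, markup_on_wages=0.0, rounds=200):
    """Iterate the three cost functions to their fixed point.

    Assumption of this sketch: the constants 320, 48, 16 are wage bills.
    """
    p = u = e = 0.0
    t = 1 + markup_on_total   # factor applied to total cost
    w = 1 + markup_on_wages   # factor applied to the wage bill only
    for _ in range(rounds):
        p, u, e = (t * (2 * u + e + w * 320),
                   t * (0.06 * p + 0.1 * e + w * 48),
                   t * (0.05 * u + w * 16))
    return p, u, e

values = equilibrium()                           # prices without mark-up
wage_case = equilibrium(markup_on_wages=0.10)    # profit as share of wage bill
total_case = equilibrium(markup_on_total=0.10)   # profit as share of total cost

wage_ratios = [x / v for x, v in zip(wage_case, values)]
total_ratios = [x / v for x, v in zip(total_case, values)]
print([round(r, 4) for r in wage_ratios])
print([round(r, 4) for r in total_ratios])
```

Under these assumptions the wage-bill mark-up scales all three prices by the same factor, whereas the total-cost mark-up scales them by different factors, which is the loss of proportionality described above.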

Noteworthy for quite different reasons is the contribution of F. D. Livshits, 
“A Comparative Characteristic of Three Kinds of Keeping Account (Uchot)” 
(pp. 359-387). It contains, incidentally, the statement that “ .. . socialism— 
that is, above all, uchot (keeping account, records, statistics)” (p. 362) (some- 
thing which Colbert said of government in general). The paper is in the main 
a brief history of economic-statistical thought, teaching, and practice in the 
Soviet Union, and a severely critical one. As early as 1872, the Russian statis- 
tician M. Baten’kov pleaded for a closer coordination of accountancy and sta- 
tistics, or rather for a subordination of bookkeeping to statistical sense. This 
is actually the theme Livshits resumes. He names many a name and quotes
(with angry question and exclamation marks) many a quote in cases, apparently
plentiful, where statistics and statisticians were completely subjugated by
accountants and their narrow, but prevailing, viewpoints. Also, he accuses his
fellow statisticians of lack of stamina in fighting back. He finds fault with and 
ridicules the characterizations, in Soviet literature, of the respective roles to 
be played by accountancy, statistics, and a third kind, called operativnyi uchot,
perhaps best translated as comprehensive managerial control, including quality
control, inventory control, and time studies. In particular, he combats the notion,
apparently widely held, that anything short of statistical analysis of the highest
standards suffices for operativnyi uchot, and he feels it is high time to make
the latter truly scientific. “But, above all else, it is necessary decisively to liber- 
ate this kind of keeping account from the inventions and lies of all kinds, which 
have accumulated in the works of our theorists and which, unthinkingly re- 
peated for many decades, have durably acquired the evil power of an ossified 
‘tradition’ ” (p. 387). There are only two people who do not appear to fall under 
Livshits’ indictment: Strumilin and, by implication, Livshits himself. 

The effective strategy of condemning the bulk of Soviet workers in the field 
by paying ample tribute to Strumilin is also masterfully used by A. A. Konyus, 
who in the very title of his paper manages to convey the impression that what 
he writes is merely an exposition of work of Strumilin: “Theoretical Problems 
Concerning Prices and Consumption in the Works of S. G. Strumilin and Roads 
to Further Research on Them” (pp. 405-419). Konyus is of course the author of 
a highly original contribution to the economic theory of index numbers.³² It
speaks perhaps loudly for Konyus’ “orthodoxy” that recent work of his, to 
which he refers, did not appear in one of the more obvious economic journals 
but in the proceedings of the institute of mathematics and mechanics of the 
academy of science of the Uzbek Soviet Socialist Republic.³³

At first Konyus expounds the essentials of a theory of price and subjective 
utility³⁴ which in form and content is almost indistinguishable from what every


31 For a detailed discussion and critique of iterative-computation aspects that are relevant here, see E. Bodewig, 
Matrix Calculus, Amsterdam: North-Holland Publishing Company, 1956, pp. 125-181, esp. 153; R. A. Buckingham,
Numerical Methods, London: Pitman Publishing Corporation, 1957, pp. 423-445; or V. N. Fadeeva, Vychislitel’nye
metody lineinoi algebry, Moskva: Gostekhizdat, 1950.

32 A. A. Konüs, “The Problem of the True Index of the Cost of Living”, Econometrica, Vol. 7 (1939) pp. 10-29.
(The spelling “Konüs” is an unusual alternative to “Konyus” or “Konius.” “Kénus,” however, is indefensible.)

33 Issue 10, Part 1, 1953, pp. 81-85.

34 That there may be absolutely no misunderstanding, he gives the English designation “Utility” and the German
“Gebrauchswert” (p. 407). 
beginning Western economist now learns in practically all textbooks,³⁵ with the
result, for instance, that in equilibrium (when the consumer maximizes his
utility, subject to an income constraint) the ratio of marginal utility to price
is the same for all goods. By making the appropriate assumptions, he then finds
little difficulty in also characterizing optimality such that the ratio of the marginal
utility of a unit of a commodity to the effort expended on the production of
it comes out the same for all commodities.
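
The textbook equilibrium condition can be verified numerically for a simple case. The sketch below assumes a hypothetical consumer with logarithmic utility; the taste weights, prices, and income are invented for illustration and come from neither Konyus nor the textbooks cited.

```python
# Hypothetical consumer: U(x) = sum(a_i * log(x_i)),
# subject to the budget constraint sum(p_i * x_i) = income.
weights = [0.5, 0.3, 0.2]     # taste parameters a_i (illustrative)
prices = [4.0, 2.0, 10.0]     # prices p_i (illustrative)
income = 100.0

# For logarithmic utility the constrained optimum has the closed form
# x_i = a_i * income / (p_i * sum(a)).
total = sum(weights)
bundle = [a * income / (p * total) for a, p in zip(weights, prices)]

# The marginal utility of good i is a_i / x_i; dividing by the price
# gives the ratio that should be equal across goods at the optimum.
mu_per_price = [(a / x) / p for a, x, p in zip(weights, bundle, prices)]

print(bundle)
print(mu_per_price)
```

The common value of the ratio is the marginal utility of income; replacing prices by efforts expended per unit gives the second optimality statement in the same way.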

Pigou³⁶ would find little to object to here. What is perhaps somewhat surprising
is Konyus’ allegation that a marginalist theory of subjective value, usually
contrasted with labor, or more generally cost, theories of value, was clearly 
envisaged by Marx and even emphasized as important. To this effect Konyus 
refers the reader mainly to passages in the Theories on Surplus Value.³⁷ Whether
the bridge between subjective and labor theories of value can be constructed or 
not, one thing is clear, and Konyus stresses it (p. 411): the optimality that 
goes with the uniform proportionality of prices to values is destroyed if com- 
modities are rationed, i.e. if prices are not the sole regulators of consumption. 
For instance the price of housing in the Soviet Union is, he says, considerably 
lower than its value, and cases might be conceivable where a dweller with more 
room than he really needs still would not renounce it because of its cheapness; 
if, however, it were more expensive, in line with values, he might wish to give 
up dwelling space in order to enjoy other goods and optimize the use of his 
income. Konyus also quotes extensively (p. 411) from D. D. Kondrashev’s 
Problems of Price Formation in the USSR. Kondrashev is apparently one of the 
main critics of the prevalent practice of readily justifying obvious divergencies 
between prices and values by social-welfare arguments because, he believes, 
prices then do not “show how much labor should be allocated to the various 
kinds of production from a macro-economic point of view,” and he feels that 
only temporarily, and for very compelling reasons, should prices be allowed to 
get out of their close relation to values. Also, it is realized by Kondrashev that 
proper prices would “strengthen the economic-calculatory stimuli in the 
economy.” Konyus agrees that it is all right for medicine to be sold below its 
value: people who need it may otherwise misperceive and underestimate the 
good it does them. For the opposite welfare reason, it is only proper that vodka 
should be sold for more than it is worth since, Konyus says somewhat illog- 
ically, only a very limited part of the population attributes use-value to “... 
this poison”... “and then, so to speak, a negative one” (p. 412). 

Konyus then deals with problems posed to his theory by the phenomenon of 
technologically joint supply, and he recognizes that “commodities which have 
completely identical consumption properties command the same price, no mat- 
ter how different the production conditions” (p. 413). He asserts that the 





35 See, for instance, Allen, op. cit. (footnote 2), pp. 126, 289-291, 313-314; Patinkin, op. cit. (footnote 25), pp.
7-12, 285-288. 

36 Arthur Cecil Pigou, The Economics of Welfare, Fourth Edition, London: Macmillan, 1932.

37 Konyus indicates the following Russian-language sources: Memories of Marx and Engels by M. Kovalevski 
(1956), p. 34, according to whom Marx resumed the study of mathematics, of differential and integral calculus, in 
order to appraise conscientiously the mathematical trend “which just then emerged in political economy”; Marx, 
Theories on Surplus Value, Vol. III, 1936, pp. 174, 194; ibid., Vol. II, 1936, part 2, pp. 171, 222; Marx and Engels,
Works, Vol. V, 1929, p. 452; Vol. XXII, 1929, p. 327; Vol. XVII, 1937, p. 96; and in later contexts: Vol. XV, 1933, 
pp. 63-64; Vol. XVIII, 1939, pp. 162, 177, Vol. XII, part I, 1933, p. 181; Vol. XIX, part I, 1939, pp. 176, 308; Vol. 
XXIV, 1951, p. 55. The Marx specialist will perhaps say that all this has been common knowledge; he will admit, 
however, that it is knowledge that is usually emphasized by neither protagonist nor antagonist. 
arithmetic mean of the values at the various plants or in various regions constitutes
the common value (which, incidentally, looks like quite an arbitrary solution).
He feels it is in keeping with the labor theory of value (which, after all,
he is out to complement, not supersede, by the subjective value theory) to
conduct the discussion not in terms of individual commodities (and their prices 
and values) but rather in terms of whole groups of commodities. An example 
comprising coal and mineral oil shows how wide the contemplated aggregates 
are. 

Individual plants with cost advantages engender, of course, a differential 
rent—a perennially fascinating subject for Soviet economists. In that context, 
one I. I. Kozodoev is quoted as holding that “the labor exerted in the particular 
plant is the source³⁸ of a differential rent... .” But “if the individual (plant)
value of the commodity exceeds its market value, then this ‘residual’ value does 
not go to those producers with whom the individual (plant) value is lower than 
the market value, but is simply not realized” (p. 413n). And Konyus adds that 
the possibility of redistribution of surplus value is one of the most important 
propositions of Marxist political economy (ibid.).

The upshot is that what may not hold in the economic microcosmos may hold 
for aggregates. Or rather, what does not hold individually can be made to hold 
by appropriate aggregation: “The sum of the prices of iron and copper must correspond
to the sum of their values” (p. 414). No wonder, then, that “the exact
determination of value (i.e. the amount of socially necessary labor embodied in 
the commodities) is, as is well known, an extremely difficult problem” (p. 415). 
Particular difficulties arise with what Engels called “goods with a long period 
of wear and tear,” and Konyus raises the “important question which prices of
durables... must equal the values: production prices or arendnye platy”
(literally: rental payments, i.e., the sum of discounted future quasi-rents³⁹). This
problem, Konyus says, has never been posed, but he adds at once that Engels
in a concrete case “categorically” took the latter option.⁴⁰ Lenin enthusiastically
adopted Engels’ view.⁴¹ Konyus adduces all this in order to lend emphasis
to the (apparently neglected) role of consumption; it appears, however, that
as a by-product the quasi-rent type theories of investment—and what is more 
“marginal” theory?—are by implication made canonical. That this interpreta- 
tion is not too far fetched becomes clear in a later line: “For the conditions of 
the Soviet socialist economy it follows in particular that the method of profita- 
bility computation ... that is practiced by us... does not possess a clear 
economic meaning” (pp. 417-418). So much about a paper which for all its 
Marx-and-Engels citations appears fresh and bold. 

Ya. I. Lukomski writes on “The Aggregative Index Form and the Measure- 
ment of the Productivity of Labor” (pp. 458-470). What is here meant by the 
aggregative index form is 


$$
\sum_{i=1}^{n} \tau_i q_i' \Big/ \sum_{i=1}^{n} \tau_i' q_i',
$$





38 My italics. 

39 Cf. Friedrich and Vera Lutz, The Theory of Investment of the Firm, Princeton, N. J.: Princeton University 
Press, 1951. 

40 Marx and Engels, Works, Vol. XV, 1933, pp. 63-64 (Russ. ed.). 

41 V. I. Lenin, Works, Vol. 35, p. 225 (Russ.).
where $\tau_i'$ is the amount of labor spent on the production of one unit of the i-th
commodity in the current period; $\tau_i$ is the analogous value in the base period;
$q_i'$ is the output of the i-th commodity in the current period. Lukomski mentions
that too many statisticians share the opinion that in index theory the
fundamental problems have already been solved. Also, he feels that while the 
economic content of the above formula is straightforward and it has been 
recommended as a matter of course in textbooks, its possibilities have never 
been fully explored. Lukomski sets out to mend that situation. 

In accordance with a Marxian formula,⁴² the productivity of labor $P_i$ is the
output $q_i$ of a given commodity measured in physical units divided by the
labor time $T_i$ spent on it. An aggregation over commodities analogously yields
the formula: 


$$
P = W(q_1, \ldots, q_n)/T,
$$


where $W(q_1, \ldots, q_n)$ is the use value (potrebitel’naya stoimost’) of the whole of
the outputs $q_1, \ldots, q_n$, and $T$ is the labor time spent on total production.
Take note empiricists, reductionists, Peirceans and Bridgmanites: Lukomski 
writes of the function W that its existence is independent of the possibility of its 
practical determination; indeed its postulated existence, not measurability in 
principle, is the point of departure for Lukomski’s further train of thought. It 
becomes clear that he views W as very unstable over time; characteristics and 
modes of use are discovered; tastes developed; needs extinguished. All this is 
good Marx. So actually W is a function of time, too. However, Lukomski circumvents
introducing an additional variable t, holding that temporal change is
to be abstracted from insofar as it does not come out in the changing q’s them- 
selves. The shape of W depends—it is again emphasized by Lukomski—on the 
attitude of the consumer to the commodities produced, and this attitude may 
differ from consumer to consumer. “When the socialist society as a whole appears
as consumer (and it is from this point of view that we here consider use
value),” Lukomski continues without bothering about aggregation over individuals,
“... the production program of the various productive organizations
of our society is partially conditioned by the location, cooperation, and other 
relations between these organizations. The relations determine for each organi- 
zation some more restricted circle of ‘main consumers’ of the mass of produced 
commodities (and of suppliers of inputs)” (p. 461). Translated into another 
jargon, what Lukomski is getting at is this: Actual output and its distribution 
are determined by the way society is organized. Actual output and its distribution
determine, with a time lag perhaps, consumer preferences. Consumer preferences,
while conditioned by the whole social structure, can only imperfectly
be represented because of communication and power constraints inherent in 
the very same social structure. Thus, the social order breeds the very tastes 
which it must partially frustrate and can only imperfectly satisfy without 
changing itself. “The shape of function W depends on the possible utilization 
of the non-assortment (i.e. not-wanted-by-the-consumer) production” (p. 
464), thus becomes an intelligible and even profound observation. 

The function W is monotonically increasing in the q’s, Lukomski says, and in
capitalistic society W increases with the q’s at diminishing rates when production
overtakes consumption; the unsold part has of course less use value than
the actually consumed part. “In our socialist economy, where production . . .
is planned according to demand, one can within rather broad limits assume that
W is a linear function of each of the arguments $q_i$,” he writes (ibid.), and a footnote
explains that “it therefrom follows that in a socialist society, from a certain
moment on, to each increase in productivity must correspond a larger
increment in use value than is the case in a bourgeois society.”

42 K. Marx, Critique of Political Economy, Russian 1938 edition, p. 17.

Lukomski considers a production program for n different kinds of output.
While $W(q_1, \ldots, q_n)$ continues to be the use value of output vector
$(q_1, \ldots, q_n)$, $W(\bar{q}_1, q_2, \ldots, q_n)$ is the use value of $q_1$ when $q_2, \ldots, q_n$ are produced
as before. Likewise, $W(q_1, \ldots, \bar{q}_i, \ldots, q_n)$ is the use value of $q_i$, other
outputs remaining the same. Lukomski has no qualms about the following
assumption:

$$
W(q_1, \ldots, q_n) = W(\bar{q}_1, q_2, \ldots, q_n) + W(q_1, \bar{q}_2, \ldots, q_n) + \cdots + W(q_1, \ldots, \bar{q}_n).
$$

Also, he assumes:

$$
\begin{aligned}
W(q_1, e_2, e_3, \ldots, e_n) &= c_1 q_1,\\
W(e_1, q_2, e_3, \ldots, e_n) &= c_2 q_2,\\
&\;\;\vdots\\
W(e_1, \ldots, q_n) &= c_n q_n,
\end{aligned}
$$


where $e_1, \ldots, e_n$ are “arbitrary numbers (production figures) within certain
bounds.” That is, use values attaching to commodities are now assumed to be 
proportional to their levels of output, as long as the level of production of other 
commodities stays within certain bounds. 

The coefficients $c_1, \ldots, c_n$ thus become the use values of a unit of the
respective commodities—strictly speaking for a moment of time, but Lukomski 
makes them constant through time. So we arrive at 


$$
W(q_1, \ldots, q_n) = \sum_{i=1}^{n} c_i q_i.
$$

The productivity-of-labor measure P; for the i-th commodity thus becomes 


$$
P_i = c_i q_i / T_i.
$$


For the productivity of labor generally, Lukomski obtains by obvious substitu- 
tions: 


$$
P = \sum_{i=1}^{n} c_i q_i \Big/ \sum_{i=1}^{n} \frac{c_i q_i}{P_i}.
$$

Finally, his productivity-of-labor index is defined as 





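$$
I = P'/P = \left( \sum_{i=1}^{n} c_i q_i' \Big/ \sum_{i=1}^{n} \frac{c_i q_i'}{P_i'} \right) \Big/ \left( \sum_{i=1}^{n} c_i q_i \Big/ \sum_{i=1}^{n} \frac{c_i q_i}{P_i} \right),
$$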
where the prime identifies values for the current period, lack of prime identifying
base-period values.

In the remaining pages, Lukomski investigates the consequences of working 
with $\tau_i = T_i/q_i$ (the labor intensity of a productive process) and the relations
between his I and the aggregative index form mentioned at the beginning, all 
of which is fairly routine analysis. The main interest of Lukomski’s approach 
lies no doubt in the particular assumptions he makes and defends. 
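
A small numerical sketch may make the construction concrete. It takes Lukomski's linear use-value assumption at face value, so that aggregate productivity is the sum of c_i q_i divided by total labor time, and forms the index as the ratio of current to base productivity. All figures and the function name `productivity` are invented for illustration.

```python
def productivity(c, q, T):
    """Aggregate productivity P = sum(c_i * q_i) / sum(T_i)."""
    return sum(ci * qi for ci, qi in zip(c, q)) / sum(T)

c = [3.0, 1.0]                    # unit use values c_i, constant over time
q_base, T_base = [100.0, 200.0], [50.0, 40.0]   # base-period output, labor time
q_cur, T_cur = [120.0, 210.0], [48.0, 35.0]     # current-period output, labor time

P_base = productivity(c, q_base, T_base)
P_cur = productivity(c, q_cur, T_cur)
index = P_cur / P_base            # productivity-of-labor index I = P'/P

# The same base-period P written with the commodity productivities
# P_i = c_i q_i / T_i, i.e. P = sum(c_i q_i) / sum(c_i q_i / P_i).
P_i = [ci * qi / Ti for ci, qi, Ti in zip(c, q_base, T_base)]
P_alt = (sum(ci * qi for ci, qi in zip(c, q_base))
         / sum(ci * qi / Pi for ci, qi, Pi in zip(c, q_base, P_i)))

print(round(P_base, 4), round(P_cur, 4), round(index, 4))
```

The two ways of writing P agree because c_i q_i / P_i is simply the labor time T_i; the index therefore depends on the assumed unit use values only through the weights they give to the outputs.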

The last contribution to the Strumilin volume—and from a statistical point 
of view probably the most workmanlike—is L. E. Mints’ “Experience with the 
Application of Correlation [Methods] to the Study of the Economic Efficiency 
of Geophysical Methods of Oil Prospecting” (pp. 471-483). Mints reports on a 
rather thorough-looking correlation and regression study in which the eventual 
crude oil yield was regressed on variometric, gravimetric, and seismometric data
and on the number of deep exploratory drillings. A series of linear regression 
equations, with from one to three regressors in various combinations, including 
multiple and partial correlation coefficients, multiple and partial regression 
coefficients, standard errors of estimate, standard errors of the regression coefficients,
etc., were computed by various approaches, the most interesting
being the one with “Chebyshev polynomials” according to a computational
design by Nemchinov⁴³ using Fisher’s F as a “measure of reliability.” The
paper concludes with a discussion of the results of the analysis and draws cau- 
tious and convincing conclusions from them. 
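
For readers unfamiliar with the quantities listed, the following sketch computes them for the simplest case of a single regressor. It uses ordinary least squares, not the Chebyshev-polynomial design attributed to Nemchinov, and the data are made up for illustration; they are not Mints’ figures.

```python
def simple_regression(x, y):
    """Least-squares fit of y on one regressor, with the usual summary measures."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual and explained sums of squares.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    ssr = sum((intercept + slope * xi - my) ** 2 for xi in x)
    s2 = sse / (n - 2)                      # residual variance
    return {
        "slope": slope,
        "intercept": intercept,
        "r": sxy / (sxx * syy) ** 0.5,      # correlation coefficient
        "se_estimate": s2 ** 0.5,           # standard error of estimate
        "se_slope": (s2 / sxx) ** 0.5,      # standard error of the slope
        "F": ssr / s2,                      # F with 1 and n - 2 degrees of freedom
    }

# Made-up data: an exploration indicator (x) and eventual yield (y).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

result = simple_regression(x, y)
for name, value in result.items():
    print(name, round(value, 4))
```

With one regressor, F equals the square of the ratio of the slope to its standard error, which is why it can serve as a single “measure of reliability” for the fitted equation.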

So much for Strumilin’s birthday volume. 


3. RYABUSHKIN’S “STATISTICAL METHODS FOR THE STUDY OF THE ECONOMY” 


A 1957 book by Timon Vasil’evich Ryabushkin is called Statistical Methods 
for the Study of the Economy,⁴⁴ and such a title appears to promise something
along econometric lines. A brief description of the contents is given before the 
introductory chapter: “In this book methodological problems in the study of 
the economy as a whole are illuminated; the system of macroeconomic statis- 
tical series, the classification of economic sectors, proportions and interdepen- 
dencies, the comparability and commensurability of statistical series of sectors 
of the economy. The book is intended for economists, statisticians, and planners.”
The relation to econometrics is remote. In comparison with contemporary
American books, it thematically comes perhaps closest to the Ruggles’
text.⁴⁵ In general style and tone it resembles even more a moribund branch of
German statistical literature where the accent was on taxonomy and so-called 
Sachlogik and where practically all statistical techniques presented were of a
descriptive, non-inferential, kind. 

Ryabushkin displays an extraordinary amount of detailed knowledge of
world statistics. He knows how they measure corn acreage in South Africa,
working days in Mexico and Turkey; he has studied the U. S. Standard Industrial
Classification Manual of 1941-1942; and he knows the pitfalls with “fob”
and “cif” figures.

43 Unfortunately, V. S. Nemchinov, Polinomy Chebysheva i matematicheskaya statistika, Moskva: Izdatel’stvo
AN SSSR, 1946, apparently essential for a complete understanding of Mints’ paper, was not available.

44 T. V. Ryabushkin, Statisticheskie metody izucheniya narodnogo khozyaistva, Moskva: Gosudarstvennoe Statisticheskoe
Izdatel’stvo, 1957, 288 pp.

45 Richard Ruggles and Nancy D. Ruggles, National Income Accounts and Income Analysis, Second Edition,
New York: McGraw-Hill Book Company, Inc., 1956.

He advocates the collection, organization, and publication of a number of 
national and regional statistical series and criticizes extant ones in great detail. 
Much of what he says, though perhaps not said for the first time, appears very 
well taken. There are plenty of interesting side glances at Soviet statistical 
practices. For instance, “We have had a general stock-taking of Government 
industry only once, namely before the beginning of the first five-year plan. . . . 
It was intended to conduct a general stock-taking at the end of the fifth five- 
year plan. A program was set up, instructions were worked out, etc. For several 
years this measure has been postponed. A general stock-taking took place 
recently in Czechoslovakia” (p. 32). 

It is also illuminating to see how, after categorizing economic activities with 
a steady eye on Marx, Ryabushkin admits that “there is of course no absolutely 
hard and fast distinction between social and non-social labor” (p. 34). Things 
made below a certain quality do not become part of the social product, “for 
society in this case does not recognize the social character of labor spent on the 
making of these things” (ibid.). And the production of armaments in capitalis- 
tic countries “ ... is useful only from the imperialists’ point of view and in- 
dubitably noxious from the point of view of the mass of the people; in addition, 
it is noxious even from the point of view of the capitalistic economy since it 
creates the conditions for its ruin” (p. 35). But of the treatment of losses 
through economic waste he says, “The problem of measuring. . . [losses 
through waste] is very complicated and completely unsatisfactorily worked 
out. So far there is not even a rigorously established classification of losses” 
(p. 37). 

After discussing what, in his opinion, should be, he deals with what is and 
describes Soviet statistical practice. “Unfortunately,” he says, “there
has been issued a complete description of [our] system of statistical series only 
once. We mean the book published by TsUNKhU SSSR, Materials for the Con- 
struction of a System of Statistical Accounts of the Economy of the USSR (Moscow 
1932)” (p. 59). He freely criticizes many shortcomings in all later similar publi- 
cations, too. Western readers of the 1956 manual Narodnoe Khozyaistvo SSSR 
will concur with Ryabushkin’s verdict: “Its shortcoming is the overburdening 
with percentage figures. Concerning very many series we do not find in it data 
in absolute numbers.” Also, it “should not yet be regarded as a developed sur- 
vey of the economy, which would afford the full possibility for a thorough sci- 
entific analysis of macroeconomic processes” (p. 71). Remarkably enough, 
Ryabushkin never tries to justify such things on grounds of militarily necessary 
secrecy. 

He then deals with classificational problems pertaining to economic sectors. 
He does not like schemes developed in other countries; they betray a “meta- 
physical” approach (p. 74). He also finds fault with UN classifications: “They 
lack a truly scientific, economico-theoretic foundation and represent only the 
attempt to systematize the existing classificational practices of various capi- 
talistic countries, minus some of the most glaring contradictions” (p. 74). As 
he sees it, “production is the main, but not the only, content of the economy” 
(p. 75). He also feels that “bourgeois statistics has not worked out a scientific
concept of sector . . . or of the whole economy, for that matter” (p. 75). So he 
gives a definition of “economic sector”: “The totality of enterprises and insti- 
tutions that fulfill essentially identical functions in the system of social repro- 
duction. Just in this sense one talks of industry, agriculture, trade, the govern- 
ment apparatus, etc., as of sectors of the economy” (p. 76). 

After reproaching bourgeois statistical classifications for having followed 
“the purely empirical road” (p. 86)—on page 74 indulging in metaphysics was 
the sin—he concedes that “of course, in scientific classification, we can never 
completely free ourselves from some elements of uslovnost’ [conventionality or 
conditionality] since there exist mixed technico-economic types of enterprises”
(p. 87). 

But for the remainder of the book bourgeois statistical classifications con- 
tinue to be attacked. Ryabushkin discusses, in considerable detail, Leontief 
matrices, input-output systems of account. “They deserve careful study. At the 
base of their construction is an idea that is akin to our so-called chessboard 
balance sheet of production with which our statistical service experimented be- 
fore the second world war” (pp. 231-232). There are the customary criticisms 
—from the constancy of the technological coefficients (p. 233) to the role played 
by prices (p. 241). Western input-output work is also bad because it fails to 
point out “contradictions in the economy of capitalistic countries” (p. 233). 
The defense-planning aspects of Leontief tableaux are bad, too, for capitalistic 
countries (p. 234). For a hint of what an inter-sector study for a capitalistic 
country should be like, Ryabushkin gives what looks like a brief account of 
Ruth Mack’s shoe-leather-and-hide case study⁴⁶ (p. 234), but he mentions
neither her name nor her book. 

The upshot is that, in spite of scathing criticism of Leontief’s work (see es- 
pecially pp. 242-243), Ryabushkin thinks that “nonetheless one must wish that 
the working-out of general inter-sector tables be undertaken by us, too” (p. 
251). 

A final chapter is concerned with the international comparability of
statistical series, with rather negative conclusions. (This job was similarly
undertaken earlier by Maslov’s Critical Analysis of Bourgeois Statistical
Publications.⁴⁷)


4. YAGLOM AND YAGLOM’S “PROBABILITY AND INFORMATION”


A booklet by the Yaglom brothers,⁴⁸ Probability and Information, happily
continues the series of competent, semi-popular expositions of mathematical 
subjects for which Soviet scholars have deserved high praise. Shannon’s original 
communication-theoretic paper⁴⁹ was lucid enough to catch the imagination of





46 Ruth P. Mack, Consumption and Business Fluctuations: A Case Study of the Shoe, Leather, Hide Sequence,
New York: National Bureau of Economic Research, 1956. 

47 P. Maslov, Kriticheskii analiz burzhuaznykh statisticheskikh publikatsii, Moskva, 1955.

48 Akiva Moiseevich Yaglom i Isaak Moiseevich Yaglom, Veroyatnost’ i Informatsiya, Moskva: Gosudarstvennoe
Izdatel’stvo Tekhniko-teoreticheskoi Literatury, 1957, 159 pp. Apparently the Yagloms have been engaged
before in skillful mathematical exposition. They are the authors of a piece called “Non-elementary Problems in 
Elementary Rendition.” 

49 Claude E. Shannon, “A Mathematical Theory of Communication,” Bell System Technical Journal, Vol. 27 
(1948), pp. 379-423, 623-656. 
the layman; Goldman’s standard text,⁵⁰ too, was a remarkable expository job.
Even so, these and other writings have perhaps retained too much telecom- 
munication-engineering flavor to provide sufficient motivation for non-engi- 
neers. It is the chief merit of the Yagloms to have written an introduction to 
Shannon-type communication theory which, by motivation and execution, 
makes it part and parcel of statistical theory, or rather its combinatorial foun- 
dation. 

The first chapter is an introduction to the concept of probability. It is simple 
and clear, presupposes next to no mathematical preparation, but leads up to 
and includes Boolean algebra and partial-order relations. Chap. 2 explains 
entropy as a measure of indeterminacy; conditional entropy (pp. 43-55); and 
finally quantity of information as a measure of how much an outcome α diminishes
the indeterminacy of some β, as well as related concepts. At each step,
there are examples worked out in detail and problems. Chap. 3 deals with the 
“Solution of Some Logical Problems with the Aid of Information Computa- 
tion.” A simple example (p. 69): Assume it to be known that the inhabitants of 
city A always tell the truth and that the inhabitants of city B always lie. An 
observer N knows that he is in A or B but he does not know which. People that 
N meets may of course come from A or B. What is the least number of ques- 
tions (to be answered by “Yes” or “No”) which N must ask to find out where 
he is?⁵¹ Something more difficult (p. 87): There are N coins of identical
denomination. One of these is counterfeit; it is known to be lighter or heavier than the
others, but one does not know which. What is the least required number of 
weighings on an equi-arm balance without weights for finding the counterfeit 
coin and determining whether it is lighter or heavier?⁵² The fourth chapter
discusses applications to the determination of economical code systems, and to
telecommunication problems generally, also with noise. An appendix is devoted 
to properties of convex functions; another to some inequalities, which are estab- 
lished by detailed proofs. All in all a very enjoyable booklet, parts of which one 
would like to put on American college reading lists. 
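
The counting argument behind the coin problem can be sketched in a few lines of Python. Each weighing on an equi-arm balance has three possible outcomes, and N coins with one counterfeit of unknown direction give 2N possible states, so k weighings can separate at most 3^k states. The function names below are illustrative assumptions and are not taken from the booklet. The first function gives only this information-theoretic lower bound; the second uses the classical result that k weighings suffice exactly when N is at most (3^k - 3)/2 (for N of 3 or more), which is quoted here as a known fact rather than derived.

```python
def info_lower_bound(n_coins: int) -> int:
    """Smallest k with 3**k >= 2 * n_coins (a lower bound only).

    There are 2 * n_coins equally likely states (which coin is counterfeit,
    and whether it is lighter or heavier); each weighing has 3 outcomes.
    """
    if n_coins < 1:
        raise ValueError("n_coins must be at least 1")
    states = 2 * n_coins
    k = 0
    while 3 ** k < states:
        k += 1
    return k


def classical_minimum(n_coins: int) -> int:
    """Smallest k with (3**k - 3) // 2 >= n_coins, valid for n_coins >= 3.

    This is the classical answer when the counterfeit must be found AND
    its direction (lighter or heavier) determined, with no extra coin
    known to be genuine.
    """
    if n_coins < 3:
        raise ValueError("the classical result assumes at least 3 coins")
    k = 1
    while (3 ** k - 3) // 2 < n_coins:
        k += 1
    return k


if __name__ == "__main__":
    for n in (3, 4, 12, 13):
        print(n, info_lower_bound(n), classical_minimum(n))
```

For 12 coins the two functions agree on three weighings; for 13 coins the information bound still gives three, while the classical minimum is four, which shows that the bound is necessary but not always attainable. The city puzzle is the degenerate case: one bit of uncertainty, and one yes-or-no question.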


5. SUMMARY 


Roughly summarizing the impressions gained from this admittedly all-too- 
small biased sample of Soviet 1957 writings that have some bearing on teaching 
and research in economic statistics, one must freely recognize some excellent 
work. Yet reading this literature is by and large an unrewarding job. On the 
average, the quality of Soviet output in this field varies with the cube of its 





50 Stanford Goldman, Information Theory, New York: Prentice-Hall, Inc., 1953.

51 Making the simplest assumption of equiprobability, N should ask, “Do you live in this city?” An affirmative
answer puts him in A, a negative one in B. 

52 The authors mention in a footnote that a more complicated version of the problem (where the number of
counterfeit coins is larger than one or unknown) was given in the Bulletin of the American Mathematical Society 
in 1956 and that it is relevant to the modern theory of computing machines. 

Apparently the reference is to S. S. Cairns, “Balance Scale Sorting,” Bulletin of the American Mathematical
Society, Vol. 62, (1956) p. 177, Abstract 260t of a paper read at the Houston meeting, which in part says: “Given 
(1) a set W of n objects, indistinguishable save that the members of a subset H of h objects are slightly heavier than 
the rest, (2) a balance scale, one seeks weighing programs minimizing either [Problem M(n, h) ] the maximum num- 
ber of weighings which may be required to cull out H or [Problem E(n, h) ] the expected number of such weighings. 
Problem M(n, 1) is a familiar puzzle. Problem E(n, 1) is here solved under various hypotheses. Problem M(n, 2) is 
partially solved... .” 
distance from the relevance of political postulates whose narrowness,
19th-century obsolescence, inconsistency, and double standards for evaluation must
more and more be felt as methodological brakes, rather than didactic boosters, 
in serious Soviet work. In the neighborhood of the purely mathematical, the 
level of performance goes up fast; the writing becomes limpid and spirited.⁵³
He who plows through the jejune verbosity that characterizes so much Soviet 
literature finds occasional nuggets of technical expertise next to the blatantly 
trite; shrewd competency amidst the sour and sullen; pieces of quiet wisdom 
beyond mere slogans. 

Everybody knows that Soviet scholarship in mathematics generally and in 
probability theory in particular is of the highest order. But there is certainly 
nothing comparable to the Wiley Series in Statistics, for instance. If for eco- 
nomic applications of inferential statistics, and for economics proper, the dis- 
tance from Western standards has been shortened in recent years, such a fact 
is not yet clearly inferable from the sampled 1957 Soviet writings. 





53 See also George E. Forsythe’s brief introduction to his Bibliography of Russian Mathematics Books, New 
York: Chelsea Publishing Company, 1956. 

Also of interest is the well-informed tribute Forsythe pays in particular to Russian work on numerical analysis 
in his (and Paul C. Rosenbloom’s) Numerical Analysis and Partial Differential Equations, Surveys in Applied Mathe- 
matics V, New York: John Wiley & Sons, Inc., 1958, pp. 8-21, 38-42. 

Finally, with a view to Boyarski’s Mathematics for Economists, it should be mentioned that the Soviet economics
student who wanted to study calculus on his own had the choice of several excellent introductory calculus texts,
which are indeed more suited for self-study (and often more carefully written) than many of their counterparts we
use in the West. Take, e.g., Anisim F. Bermant, Kurs matematicheskogo analiza, izd. 11, 8, Moskva: Fizmatgiz,
1958, 2 vols., 466 + 358 pp.; Nikolai S. Piskunov, Differentsial’noe i integral’noe ischisleniya dlya Vtuzov, Moskva:
Gostekhizdat, 1958, 844 pp.; Nikolai N. Luzin, Differentsial’noe ischislenie, izd. 6, Moskva: Gosudarstvennoe
izdatel’stvo “Sovetskaya nauka,” 1958, 473 pp., and by the same author and publishing house, Integral’noe ischislenie,
izd. 6, 1958, 415 pp.








PUBLICATION DECISIONS AND THEIR POSSIBLE EFFECTS ON 
INFERENCES DRAWN FROM TESTS OF SIGNIFICANCE 
—OR VICE VERSA* 


THEODORE D. STERLING
University of Cincinnati 


There is some evidence that in fields where statistical tests of signifi- 
cance are commonly used, research which yields nonsignificant results 
is not published. Such research being unknown to other investigators 
may be repeated independently until eventually by chance a significant 
result occurs—an “error of the first kind”—and is published. Significant 
results published in these fields are seldom verified by independent 
replication. The possibility thus arises that the literature of such a field 
consists in substantial part of false conclusions resulting from errors of 
the first kind in statistical tests of significance. 


IT HAS become commonplace to speak of a “level of significance” in reporting
outcomes of experiments. This significance level refers to the risk of rejecting
the null hypothesis, H₀, erroneously and, seemingly, has no other direct
relationship to experimental work. The experimenter who uses so-called tests of
significance to evaluate observed differences usually reports that he has tested
H₀ by finding the probability of the experimental results on the assumption
that H₀ is true, and he does (or does not) ascribe some effect to experimental
treatments. What with the shortage of publication space and the desire for
objectivity, it often seems that the responsibility for rejecting a hypothesis rests
squarely on a crucial value in a table of probabilities.

The risk of choosing the incorrect inference from experimental observation
depends on a stated risk of rejecting H₀ if true and on the risk of failing to do
so if H₀ is not true. Here is a dilemma which is dealt with in practice by two
conventions. As Savage notes [7, p. 256], publications tend to report the results
of the test as well as that level of significance for which the corresponding test
of the relevant family would be on the borderline between acceptance and
rejection (in the view of the author). The individual reader now makes his own
test at a level of significance appropriate to him. How much uncertainty such
a reader is willing to tolerate in rejecting a hypothesis that might be true will
depend on his confidence in the methods of data collection, his views concerning
the relevance of alternative hypotheses, or the weight he gives to evidence
from other sources. In addition, scientific readers differ in fundamental
strategies for games against nature, and their tolerance for errors can hardly be
expected to remain unchanged from one experimental problem to another. The
type of reporting mentioned by Savage may well be most satisfactory for author
and reader alike.

Some publications, notably of social science content, have adopted a
somewhat more extreme convention. Here a borderline between acceptance and
rejection of H₀ is taken as a relatively fixed point, usually at Pr(E | H₀) < .05 or





* The author wishes to express his thanks to Sir Ronald Fisher whose discussion on related topics stimulated 
this research in the first place, and to Leo Katz, Oliver Lacey, Enders Robinson, and Paul Siegel for reading and 
criticizing earlier drafts of this manuscript. 
at that approximate region for which the probability (Pr) of the outcome (E)
of the experiment, calculated on the assumption that H₀ is true, is no larger
than five in a hundred¹ [3] [6] [8]. General adherence to such a rigid strategy
is interesting by itself but might have no further consequences on the decisions
reached. However, when a fixed level of significance is used as a critical criterion
for selecting reports for dissemination in professional journals, it may result in
embarrassing and unanticipated results.


TABLE 31

OUTCOMES OF TESTS OF SIGNIFICANCE FOR FOUR
PSYCHOLOGY RESEARCH JOURNALS

Columns:
(1) Total number of research reports
(2) Number of research reports using tests of significance
(3) Number of research reports that reject H₀ with Pr(E | H₀) < .05
(4) Number of research reports that fail to reject H₀
(5) Number of research reports that are replications of previously published experiments

Journals: All Issues From January to December        (1)   (2)   (3)   (4)   (5)

Experimental Psychology (1955)                       124   106   105     1     0
Comparative and Physiological Psychology (1956)      118    94    91     3     0
Clinical Psychology (1955)                            81    62    59     3     0
Social Psychology (1955)                              39    32    31     1     0

Total                                                362   294   286     8     0

Table 31 shows that for psychological journals a policy exists under which
the vast majority of published articles satisfy a minimum criterion of
significance. The table summarizes the number of research articles in four
publications. The journals were selected at random from four major areas of
psychology. The table gives the distribution for the number of reports that used
tests of significance to test H₀ and either rejected H₀ or failed to do so at
Pr(E | H₀) < .05. In addition the table gives the number of experiments that
were replications of previously published investigations. Column 1 gives the
number of experimental research reports and column 2 gives the number of
those reports that used tests of significance to choose among possible alternative
hypotheses. Column 3 shows how many of the reports of column 2 managed to
reject H₀ and column 4 counts the number of reports that failed to reject H₀
(either for the major hypothesis tested or for the majority of hypotheses under
investigation).² Finally, column 5 gives the number of experiments representing
a replication of work previously reported in the literature.





1 The fact that some tables present only the .05 and .01 levels of significance encourages the use of these two 
levels of significance [8, p. 292]. 

2 Some explanatory remarks concerning Table 31 are in order. Almost all of the 294 studies that used tests of
significance were of a multivariable design. All evaluated observed differences against the assertion of H₀; however,
H₀ was sometimes not rejected for all variables tested. The following rules were adopted in compiling Table 31:

a. The attempt was made to determine the major variable or prediction tested by the research design. Such was
usually clear from the author’s preliminary remarks; the multivariable design was most frequently used to
control for conditions not covered by the experimental procedure. The level of significance for which H₀ was
rejected for the major prediction was noted (if H₀ was rejected at all).

b. If the design tested two or more variables for which no unambiguous decision as to major importance could
be made, the lowest level of significance for which at least half the variables rejected H₀ was noted. If H₀ was
not rejected for at least half the variables, the article was placed in the class of studies for which H₀ was not
rejected.
Table 32 shows the same distributions as proportions of columns 1 and 2.
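
The proportions in Table 32 can be recomputed from the counts. The sketch below rests on two assumptions that should be kept in mind: the per-journal counts are the integers implied by the totals of Table 31 (362, 294, 286) together with the percentages of Table 32, and column 4 is taken to be column 2 minus column 3, i.e. every report using tests either rejects H₀ or fails to. The names `COUNTS` and `percentages` are illustrative only.

```python
# (total reports, reports using tests, reports rejecting H0); values are
# assumed reconstructions consistent with the published totals and percentages.
COUNTS = {
    "Experimental Psychology (1955)": (124, 106, 105),
    "Comparative and Physiological Psychology (1956)": (118, 94, 91),
    "Clinical Psychology (1955)": (81, 62, 59),
    "Social Psychology (1955)": (39, 32, 31),
}


def percentages(total: int, using_tests: int, rejecting: int) -> tuple:
    """Return the (2/1), (3/2) and (4/2) percentages for one row."""
    not_rejecting = using_tests - rejecting  # column 4, by assumption
    return (
        100.0 * using_tests / total,
        100.0 * rejecting / using_tests,
        100.0 * not_rejecting / using_tests,
    )


def totals(counts: dict) -> tuple:
    """Sum each of the three count columns over all journals."""
    return tuple(sum(row[i] for row in counts.values()) for i in range(3))


if __name__ == "__main__":
    for journal, row in COUNTS.items():
        print(journal, ["%.2f" % p for p in percentages(*row)])
    print("Total", totals(COUNTS), ["%.2f" % p for p in percentages(*totals(COUNTS))])
```

Run as a script, this prints one row per journal followed by the pooled totals, which makes it easy to check any single entry of the table against its neighbours.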

A glance at the tables is sufficient to show that most articles published during
the year by the journals in question used tests of significance as aids in
choosing among alternative experimental hypotheses and, at the same time, that
nearly all managed to reject H₀ at the recommended level of certainty. It
need not be assumed that the observed distributions are due to explicit edi- 


TABLE 32

PER CENT OF ARTICLES USING TESTS OF SIGNIFICANCE
AND PER CENT OF ARTICLES REJECTING H₀

Columns:
(2/1) Per cent of articles using tests, of all articles published
(3/2) Per cent of articles rejecting H₀, of all articles using tests
(4/2) Per cent of articles not rejecting H₀, of all articles using tests

Journals: All Issues From January to December       (2/1)   (3/2)   (4/2)

Experimental Psychology (1955)                      85.48   99.06    0.94
Comparative and Physiological Psychology (1956)     79.66   96.81    3.19
Clinical Psychology (1955)                          76.54   95.16    4.84
Social Psychology (1955)                            82.05   96.88    3.12

Total                                               81.22   97.28    2.72

torial rules. The single factor contributing most to the selection of articles in 
which H₀ is rejected may be implicit agreement among authors. The term
“publication policy” will be used here largely as a matter of convenience. In
fact, the distribution of articles in psychological journals in general appears to
be similar to the ones shown in the table and it seems likely that the authors’
selection rather than editorial policy accounts for the observed profession-wide
selection. Whatever the reasons, the tables indicate what gets printed with a
high probability; namely, research reports that use tests of significance and at
the same time reject H₀ for the effects of treatments in the design.³ To state the
above more concisely:





c. Where results from more than one research design were reported, an attempt was made to determine the one
study deemed most crucial by the author and the level of significance for that study was recorded if it rejected
H₀.

d. If all studies seemed of equal importance, the lowest level of significance for which at least half the reported
studies rejected H₀ was recorded. If H₀ was not rejected for at least half the studies reported, the article was
placed in the class of studies for which H₀ was not rejected. (This special provision in b and d was not really
necessary since for no single article were less than half of the quoted results in the significant category.)

e. Two studies that obtained Pr(E | H₀) < .1 were included because the authors had expressly pointed out that
they rejected H₀ since the obtained significance level was close enough to the conventional .05 to suit their
purposes.

Since the Psychological Abstracts essentially attempt to present an outline of the major points made in almost all
research articles of interest to psychologists, the procedure used here could be checked for reliability with that
publication. Of 100 research articles selected at random from volumes covering 1952 to 1957, 94 reported positive results,
5 reported negative results, and one was a replication of a previous study. These proportions agreed by and large with
the total proportions in Table 31. No comparison for use of tests of significance was made since that journal seldom
reports results of statistical tests. However, the words “significantly different” were applied to most of the reported
results.

3 It is interesting that the Journal of Experimental Psychology appears to set the pace for the use of statistical
tests as well as for the selection of articles that reject H₀. Some years ago the same journal was used [4] to show that
χ² was consistently misused by psychologists. The authors noted at the time that analyses in this journal would be
typical for psychological publications in general and that the expectation of finding sound statistical treatments
would be better in that journal than in others.
A₁ Experimental results will be printed with a greater probability if the
relevant test of significance rejects H₀ for the major hypothesis with
Pr(E | H₀) < .05 than if they fail to reject H₀ at that level.

A₂ The probability that an experimental design will be replicated becomes
very small once such an experiment appears in print.


With respect to A₁, it is not known how many research results reject
H₀ or fail to do so, nor how many are submitted or not submitted for publication.
However, it does seem clear [2] that pressure exists which leads to the selection of
a very small number of publications from a large number of submitted
manuscripts. From a commonly admitted tendency to acknowledge only the most
significant findings, and from perusal of statements concerning publication
pressures [2], one could infer another reasonable assumption:


A₃ A great many more experiments are performed than appear in the pages
of professional journals. 


With respect to A₂, the lack of replication of experimentation in psychology
has been noted elsewhere [5]. Replications are sometimes reported at profes- 
sional meetings. Since such papers are rarely used as references unless they 
have been published they may be ignored as sources for widespread professional 
or scientific information. 

The three assumptions are admittedly substantive in nature and strong
supporting evidence for them, beyond that given here, is hard to come by. They
may be taken as a fair statement of the prevailing conditions in which the
scientific community is not equally aware of all experimental results. As a
consequence, experiments for which Pr(E | H₀) is large may well have a high
frequency of replication by individuals who do not know that this particular
comparison had been made previously, and that previous tests of significance
had failed to reject H₀ at acceptable levels of significance. Once a study does
result in a level of significance that meets this criterion, not only will it be
published, but the likelihood of its ever being repeated appears to become very
small. A picture emerges for which the number of possible replications of a test
between experimental variates is related inversely to the actual magnitude of
the differences between their effects. The smaller this difference the larger may
be the likelihood of repetition. This chain is terminated apparently by an
observation for which the relevant statistical test can reject H₀ with reasonable
certainty. For any set of observed differences that are randomly variable (and
which experimental observations are not?) a difference of some substance
should then appear in print, irrespective of the actual state of nature. What
credence can then be given to inferences drawn from statistical tests of H₀ if
the reader is not aware of all experimental outcomes of a kind? Perhaps even
more pertinent is the question: Can the reader justify adopting the same level
of significance as does the author of a published study?
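
The replication argument can be made concrete with a small calculation. As a simplifying assumption that the text does not state in this form, suppose H₀ is true, each of k investigators tests it independently at level α, and the chain stops at the first rejection. The chance that at least one attempt rejects H₀ is then 1 - (1 - α)^k. The function names below are hypothetical and serve only as illustration.

```python
def prob_some_rejection(alpha: float, k: int) -> float:
    """P(at least one of k independent tests rejects a true H0 at level alpha)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if k < 0:
        raise ValueError("k must be non-negative")
    return 1.0 - (1.0 - alpha) ** k


def attempts_for_probability(alpha: float, target: float) -> int:
    """Smallest k for which prob_some_rejection(alpha, k) >= target."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    k = 0
    while prob_some_rejection(alpha, k) < target:
        k += 1
    return k


if __name__ == "__main__":
    for k in (1, 5, 14, 20, 50):
        print(k, round(prob_some_rejection(0.05, k), 3))
    print(attempts_for_probability(0.05, 0.5))
```

Under these assumptions, with α = .05, fourteen independent attempts are already enough to make a spurious rejection more likely than not, and the expected number of attempts before the first rejection is 1/α = 20.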

Two points are worth noting with respect to the last two questions. Both 
refer to the expectations a reader may form when he picks up an article in one 
of the journals of Table 31 (or in a journal following like practices). 

First, the reader’s best expectation is that the author will reject H₀. The
probability that he will commit a Type II error (accepting the null hypothesis
when it is false) if he adopts the author’s conclusion is, in consequence,
extremely small. In fact, from Table 31 it appears that this risk is scarcely more
than zero. One may therefore conclude that any and all tests used by authors
are of equally high power for the reader. This obviously was not true for the
individual investigator who attempted to choose the most powerful test in the
first place.

There is also another side to this problem. The reader’s expectations are
that H₀ will be rejected. What risks does he take in making a Type I error by
rejecting H₀ with the author? The author intended to indicate the probability
of such a risk by stating a level of significance. On the other hand, the reader
has to consider the selection that may have taken place among a set of similar
experiments for which the one that obtained large differences by chance had
the better opportunity to come under his scrutiny. The problem simply is that
a Type I error (rejecting the null hypothesis when it is true) has a fair
opportunity to end up in print when the correct decision is the acceptance of H₀ for a
particular set of experimental variables. Before the reader can make an
intelligent decision he must have some information concerning the distribution of
outcomes of similar experiments or at least the assurance that a similar
experiment has never been performed. Since the latter information is unobtainable
he is in a dilemma. One thing is clear, however. The risk stated by the author
cannot be accepted at its face value once the author’s conclusions appear in
print. It may be safe to conclude that pursuing statistical analyses under the
conditions outlined here may have considerably less merit than psychologists
like to ascribe to statistics in experimental design.
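
Why the stated risk cannot be taken at face value can be illustrated with a rough calculation. The sketch below is an assumption-laden illustration and not part of the original argument: it supposes that only rejections of H₀ reach print, that a fraction `prior_null` of all hypotheses tested are in fact true nulls, and that tests of false nulls reject with probability `power`. All three parameter names and every numerical value are hypothetical.

```python
def false_rejection_share(prior_null: float, alpha: float, power: float) -> float:
    """Share of printed rejections that are Type I errors.

    Assumes only rejections are printed, so the share is
    P(H0 true and rejected) / P(rejected).
    """
    for name, value in (("prior_null", prior_null), ("alpha", alpha), ("power", power)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    false_rejections = prior_null * alpha          # true H0, rejected by chance
    true_rejections = (1.0 - prior_null) * power   # false H0, correctly rejected
    total = false_rejections + true_rejections
    if total == 0.0:
        raise ValueError("no rejections are possible with these inputs")
    return false_rejections / total


if __name__ == "__main__":
    for prior in (0.5, 0.9, 1.0):
        print(prior, round(false_rejection_share(prior, alpha=0.05, power=0.5), 3))
```

In this toy setting the share of erroneous rejections in print depends on quantities the reader cannot observe, and it equals the author's stated .05 only by coincidence; if every hypothesis tested were a true null, every printed rejection would be a Type I error.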

It would be unfair to close with the impression that the malpractices
discussed here are the private domain of psychology. A few minutes of browsing
through experimental journals in biology, chemistry, medicine, physiology, or
sociology shows that the same usages are widespread through other sciences.
Some onus appears to be attached to reporting negative results. Certainly such 
results occur with lesser frequency in the literature than they may reasonably 
be expected to happen in the laboratory—even if it is assumed that all experi- 
menters are outstandingly clever in selecting hypotheses. Perhaps the trend of 
our time is exemplified by the editors of a cancer journal who in a recent an- 
nouncement took action to change the name of their yearly supplement from 
“Negative Data...” to “... Screening Data” [1, p. 619]. 
* This was pointed out to me by Charles Stevens of the Kettering Laboratory, University of Cincinnati.





INHALATION IN RELATION TO TYPE AND 
AMOUNT OF SMOKING* 


E. CUYLER HAMMOND
American Cancer Society 


A survey conducted by mail was made to obtain information on in- 
halation in relation to type and amount of smoking. The proportion of 
men who said that they inhaled: (1) increased with amount of smoking 
and decreased with age, (2) was very much higher for cigarette smokers 
than for cigar and pipe smokers, and (3) was much higher for men who 
smoked only cigarettes than for men who smoked both cigarettes and 
cigars. The proportion of men who said that they inhale differed very 
little between those smoking filter tip cigarettes and those smoking non- 
filter tip cigarettes. 

A test was made to determine whether the wording of the letter of 
transmittal enclosed with the questionnaires, the organization from 
which the questionnaires were sent, the presence or absence of a postage 
stamp on the envelope enclosed for reply, or the failure of some men to 
reply had an influence on the findings. It appeared that these factors 
made very little difference in the percentage distribution of responses 
to questions on smoking habits. However, a larger percentage of the
addressees replied when a return envelope with a postage stamp attached
was enclosed than when a business reply envelope not requiring a post- 
age stamp was enclosed. The wording of the letter of transmittal also 
seemed to have some influence on the percentage of replies. 


BACKGROUND 


IN A recent prospective study of white men between the ages of 50 and 69
[6] it was found that the total death rate of those who smoked cigarettes
only was considerably higher than the total death rate of those who smoked 
pipes or cigars but did not smoke cigarettes. This was found among men living 
in rural areas as well as among men living in urban areas. Standardized for 
age and amount of cigarette smoking, the death rate of men currently smoking 
cigars as well as cigarettes was about 29% lower than the death rate of men 
smoking cigarettes only. Smokers of every type had total death rates higher 
than those of men of the same age who never smoked. However, there was only 
a slight difference in total death rates between pipe smokers and non-smokers. 

The term “inhalation” as used in this report means drawing tobacco smoke 
into the lung. Some smokers do so deliberately, others perhaps do so uncon- 
sciously, and still others do so little, if at all [7]. 

If a smoker inhales, his lungs are directly exposed to tobacco smoke and a 
considerable proportion of the nicotine [4] and carbon monoxide are absorbed 
into his blood. On the other hand, a smoker who does not inhale has very little
exposure to tobacco smoke products except for the epithelium of the lip, 
tongue, mouth, pharynx, and esophagus. Death rates from cancer of these 
tissues (taken as a group) were found [6] to be far higher among cigar smokers, 
pipe smokers, and cigarette smokers than among non-smokers. 





* From the Statistical Research Section of the American Cancer Society and the Statistical Laboratory of Yale 
University. 
These findings suggest the hypothesis that the differences in death rates
between smokers of various types might be due to differences in amount of 
inhalation. This applies to death rates from a number of different diseases, not 
just cancer. The main objective of the present study was to determine whether 
this is a tenable hypothesis. If apparently tenable, further research will still 
have to be done to determine whether or not the hypothesis is correct. R. A. 
Fisher [3], among others, has suggested that the problem of inhalation is criti- 
cal to an understanding of the possible effects of smoking. 


PART I: 1. SOME POSSIBLE SOURCES OF BIAS 


When questionnaires are sent out by mail, only a certain proportion of the 
addressees reply and there is always the possibility that the non-responders 
differ from the responders in respect to the characteristics being investigated. 
Furthermore, there is a possibility that the wording of the letter of transmittal 
as well as the identity of the investigator can have an influence on the answers. 

Many people are more or less aware that the American Cancer Society has 
reported an association between smoking habits and lung cancer. Furthermore, 
the very word cancer raises an emotional response in some people. Conceivably, 
this could have introduced a bias both as to which addressees replied and as 
to answers concerning their smoking habits. Therefore, an attempt was made 
to determine whether such bias would in fact occur in the study to be under- 
taken. 


2. DEVELOPMENT OF QUESTIONNAIRE 


It has been shown [2] that information on current type and amount of smok- 
ing can be obtained by questionnaire with a reasonable degree of accuracy (at 
least to the extent of distinguishing between light smokers and heavy smokers). 
Few attempts have been made to obtain information on inhalation from large 
numbers of people. After trying out a number of different forms of questioning, 
the following was selected as appearing to be the most satisfactory: 

How much do you inhale when smoking cigarettes? (check one): 

Do not inhale [ ]; Inhale slightly [ ]; Inhale moderately [ ]; Inhale deeply [ ].

The same question, with the change of one word, was used for pipe smokers 
and for cigar smokers. I have been unable to think of a fully reliable way to test 
the accuracy of the responses. However, on observing a number of people, those 
who did not appear to inhale checked “do not inhale” or “inhale slightly” and 
those who appeared to inhale checked “inhale moderately” or “inhale deeply.” 

A simple questionnaire was designed asking age, sex, whether the individual 
has ever smoked cigarettes, pipes, or cigars regularly and whether he now 
smokes one or more of these types. (See figure 37). Those saying that they 
now smoke cigarettes are asked: “About how many packs of cigarettes do you 
smoke a day?,” the question on inhalation, and “About how much of each 
cigarette do you usually smoke?” Those saying they now smoke pipes are 
asked “About how many pipefuls of tobacco do you smoke a day?” and the 
question on inhalation. Comparable questions are asked of cigar smokers. 
The questionnaire was pretested on 53 subjects and appeared to be
satisfactory except for the question on the amount of each cigarette usually smoked.
The subject was asked to indicate this by marking a diagram of a cigarette 
which was printed on the questionnaire. In order to test the validity of the 
answers to this question, cigarette butts were collected from ash trays used by 
each of the 53 subjects. It was found that the length of the butts left by an 
individual is highly variable, probably depending upon interruptions while 

Age: Sex: = Check _ one: 


Male[] Female [() 


Have you ever amoké¢d cigarettes regularly? 
De you now smoke cigarettes? 


Have you ever smoked a pipe regulerly? 
Do you now smoke a pipe? 


Have you ever smoked cigars regularly? 
Do yop now smoke cigars? 





If you now smoke cigarettes: 
+ About how maay packs of cigerettes do you smoke a day? 


+ What type? (Check one): Filter tip[J or Non-filter tipl ] 
+ How much do you inhale when smoking cigarettes? (Check one) 
Do not inhale(] ; Inhale slightly(] ; Inhale moderately (] ; Inhale deeply(J 
« About how much of a cigarette do you usually smoke? (Please 5 
indicate this by marking on the diagram below how much of 
your cigarette is usually left when you put it out): 


( o* aa 
a 2 











If you now smoke a pipe: 
« About how many pipefuls of tobacco do you smoke a day? 
+. How much do you inhuie when smoking a pipe? (Check one) 
Do not inhale[] ; Inhale slightly() ; Inhale moderately(] ; Inhale deeply] 





If you now smoke cigars? 


+ About how many cigars do you smoke a day? 
+ How auch do you inhale when smoking cigars? (Check one) 
Do not inhale[) ; Inhale slightly() ; Inhale moderately () ; Inhele deeply) 


Fia. 37 


smoking as well as many other factors. The correlation coefficient between the 
length indicated on the questionnaire and the average length of the butts was 
only +.45. Conceivably, the length as indicated on the questionnaire is a bet- 
ter estimate of the individual’s usyal habits than the average length of the butts 
left during the preceding few hours. However, the test seemed to indicate such 
poor validity that no confidence could be placed in the questionnaire responses 
to this question. Nevertheless, the question together with the diagram of the 
cigarette was left on the questionnaire because it seemed to interest the sub- 
jects and contribute to better cooperation. 
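
The validity check described above amounts to correlating two measurements
made on the same subjects. A minimal sketch of that calculation follows. The
function name and the six pairs of lengths are hypothetical and are not data
from the pretest; the paper does not state how its coefficient of +.45 was
computed, so the ordinary product-moment formula is assumed here.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and of squares about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical illustration only (NOT the study's data): butt length in mm
# as marked on the diagram, and average length of butts actually collected.
marked_mm = [20, 25, 30, 15, 35, 22]
measured_mm = [28, 24, 33, 21, 30, 31]
r = pearson_r(marked_mm, measured_mm)
print(f"r = {r:+.2f}")
```

A coefficient near +1 would indicate that the diagram marks track the
collected butts closely; a value such as the +.45 reported above indicates
only a loose agreement.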
3. SELECTION OF SAMPLES 


In order to obtain a sufficient number of cases in each of the several categories 
by type of smoking, the total sample had to be large. By preference, we would 
have studied a random sample of the United States male population, but this 
is very costly. Therefore, samples were drawn from telephone directories in 
such a way as to be proportional to the geographic distribution of the popula- 
tion of the United States. A large proportion of men in this country over the 
age of 30 must now be listed in telephone directories, there being: (1) about 
42 million men over 30 in the United States, and (2) about 45 million home 
telephones, a majority of which are listed under the names of men over 30. 
Telephone subscribers are not a perfectly random sample of the total adult 
male population since the lowest economic groups are probably under-repre- 
sented and people residing in institutions of various sorts are seldom sub- 
scribers. However, the sample was considered to be satisfactory for the pur- 
pose at hand since the same sorts of people were probably under-represented in 
the study of smoking in relation to death rates referred to earlier [6]. Thus the 
two studies should be roughly comparable in this respect. 

The selection of names from telephone directories was carried out as follows: 

The number of names to be drawn from each section of the country was pre- 
determined so as to be proportional to the male population as reported in the 
1950 census of the United States. Within each section, the number of names to 
be drawn from rural areas and from cities and towns of various sizes was made 
proportional to the population of that section as of 1950. A number of tele- 
phone directories were selected from each section with due consideration to the 
factors mentioned above and the number of names to be drawn from each 
directory was determined accordingly. The number of names to be drawn from 
any single directory was less than the number of pages in the directory, so just 
one name was selected from a particular page. If, for example, 50 names were 
to be selected from a directory with 500 pages, one each was taken from pages 
1, 11, 21, 31, etc. The name selected from a particular page was the first which
appeared to be that of an individual male subscriber (i.e., corporation names 
and names appearing to be female were skipped). 
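
The page-selection rule just described can be sketched in a few lines. The
function names are hypothetical, and the test for "appears to be an
individual male subscriber" is left as a caller-supplied function because the
paper describes it only as a judgment made on reading the name.

```python
def pages_to_sample(n_names, n_pages):
    """Evenly spaced page numbers, one name per page, starting at page 1."""
    if not 0 < n_names <= n_pages:
        raise ValueError("n_names must be between 1 and n_pages")
    step = n_pages // n_names  # e.g. 500 pages / 50 names -> every 10th page
    return [1 + i * step for i in range(n_names)]


def first_eligible(names_on_page, looks_like_individual_male):
    """First listing on the page that passes the eligibility judgment."""
    for name in names_on_page:
        if looks_like_individual_male(name):
            return name
    return None  # no eligible listing on this page


# The example from the text: 50 names from a 500-page directory.
pages = pages_to_sample(50, 500)
print(pages[:4], "...", pages[-1])
```

Using integer division keeps the interval constant; when the number of pages
is not an exact multiple of the number of names, some pages at the end of the
directory are simply never reached, which is an assumption of this sketch
rather than something the paper specifies.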


4. TESTS FOR POSSIBLE BIAS 
The First Test: 


A sample of 1,977 male names was selected as described above and divided 
by alternate names into two equal groups. Questionnaires were mailed to one 
of these groups from the American Cancer Society with a covering letter clearly 
indicating that we were interested in the relationship between smoking and
cancer. A deliberate attempt was made to raise an emotional reaction by start- 
ing the letter with the words: “Last year cancer killed 245,000 Americans. . . .” 
(See Appendix A.) The questionnaire was printed on the back of the covering 
letter and the letters were individually signed in ink. A business reply envelope 
requiring no stamp was enclosed with each questionnaire. 

Questionnaires were mailed to the second group from the Yale Statistical 
Laboratory. The covering letter avoided any mention of cancer or of health 
and gave the impression that we were interested in tobacco sales in relation to
advertising and publicity (which is indeed one of our interests). (See Appendix
B.) Enclosed for reply was an envelope addressed to the Yale Statistical
Laboratory with a stamp attached. All of the questionnaires in both groups were
mailed on November 15, 1957, the Yale group being sent from New Haven and
the American Cancer Society group being sent from New York.

On December 6, 1957 a second questionnaire was mailed to every subject 
who failed to reply the first time (excluding cases where the first letter was 
returned by the post office as undeliverable). Except for the letterhead, the 
covering letter (which made no mention of cancer or health) was the same for 
the two groups. (See Appendix C.) This time, an addressed envelope with a 
stamp attached was enclosed with questionnaires mailed from the American 
Cancer Society as well as those mailed from Yale. 


The Second Test: 


A sample of 4,015 male names was selected in the same way as before and 
divided into three groups. A covering letter was written with no mention of 
cancer or health. (See Appendix D.) Questionnaires with this covering letter 
and a business reply envelope requiring no stamp were sent from the American 
Cancer Society to 1,004 men. Questionnaires with the same covering letter but 
a reply envelope with a stamp attached were sent from the American Cancer 
Society to another 1,004 men. Questionnaires with the covering letter used for 
the first Yale test (see Appendix B) and a reply envelope with a stamp attached 
were sent from the Yale Statistical Laboratory to 2,007 men. The question- 
naires were mailed on January 29, 1958. 

Finally, on February 24, 1958, questionnaires were sent to all the men in the 
three groups who had not replied to the earlier letter (excluding cases where 
the first letter was returned by the post office as undeliverable). The same 
follow-up letter as before (see Appendix C) was used for all three groups and a 
reply envelope with a stamp attached was enclosed with every questionnaire. 


5. RESULTS OF TESTS 
Per Cent Returns: 


Table 40 shows, for each of the mailings described above, the number of 
questionnaires sent out, the number returned by the post office as undeliver- 
able, and the number returned by the addressees (or a member of the addres- 
see’s family in a few cases). Addresses given in telephone directories are some- 
times so abbreviated as to be unsatisfactory for mail. This appears to have been 
the most common difficulty in those cases where the letter was returned by the 
post office as undeliverable. Other difficulties included out of date addresses 
and some clerical errors in copying. It seems unlikely that this factor (i.e., 
unsatisfactory mailing address) would introduce a bias of any sort in the re- 
turns. Therefore, it seems most reasonable to discuss the replies in relation to 
the number of questionnaires which were presumably delivered to the addres- 
sees. 

The first A.C.S. test differed greatly from the first Yale test as to the wording 
of the covering letter (see Appendix A and Appendix B) and also differed as 
to the presence or absence of postage stamps on the reply envelopes used for the
first mailings. The percentages of replies from the first mailings were 27.6% for
the A.C.S. sample and 40.0% for the Yale sample. (See Table 40.) The same
letter (except for the letterhead) was used in the second mailing for these two
samples and the percentages of replies were 43.3% and 53.5% respectively. The
net percentages of replies from the two mailings combined were 59.2% and 72.7%
respectively. It appears that a combination of factors (e.g., letter of
transmittal, stamp or no stamp on reply envelope, sponsorship, etc.) can have a
considerable influence on the percentage of replies to inquiries sent out by
mail.


TABLE 40


NUMBER OF QUESTIONNAIRES MAILED OUT AND NUMBER OF
REPLIES RECEIVED FROM EACH OF SEVERAL MAILINGS


[Only the covering-letter column and six reply counts are legible in the
source; entries shown as ".." could not be recovered. The remaining columns
gave the number mailed out and the replies as a per cent of the total.]

Covering Letter*                       Replies Received, No.

(A) Cancer emphasized                          252
(B) Advertising and publicity                  372
(C) Request for reply                          278
(C) Request for reply                          282
See above, (A) and (C)                         530
See above, (B) and (C)                         654

(D) Cancer not mentioned                        ..
(D) Cancer not mentioned                        ..
(B) Advertising and publicity                   ..
(C) Request for reply                           ..
(C) Request for reply                           ..
See above, (D) and (C)                          ..
See above, (B) and (C)                          ..

Grand Total                                     ..

* See appendix for copies of letters A, B, C, and D.
† Number of people.
1 One-half of second A.C.S. test.
2 Other half of second A.C.S. test.


The first and second tests made from Yale were identical as to covering let- 
ters and method of drawing the samples. They differed only as to date (the first 
starting on November 15, 1957 and the second starting on January 29, 1958). 
As shown in Table 40, the percentage of replies received was very nearly the same
for these two groups (i.e., 40.0% and 41.2% respectively). Therefore, it appears 
that the difference in dates had little or no influence on the percentage replies. 
The results might have been quite different at some other time of the year such 
as the Christmas season or the summer vacation period. 

The first A.C.S. test and one half of the second A.C.S. test were alike to the 
extent that business reply envelopes (with no stamp but with return postage 
guaranteed) were used for the first mailing in both instances. However, they 
differed greatly in respect to the wording of the covering letter. (See Appendix 
A and Appendix D.) The percentage replies from the first mailings were 27.6% 
and 33.3% respectively. This difference is statistically significant (P =.008).
It appears that the wording of the letter of transmittal is of some importance.
In this instance, a simple request for information brought a higher percentage
of replies than a letter written in the way in which an appeal for funds to
support cancer research might be written.

The same covering letter was used for the two halves of the second A.C.S. 
test. (See Appendix D.) They differed only to the extent that a business reply 
envelope was used for the first half and a return envelope with a stamp at- 
tached was used for the second half. The percentage replies from the first mail- 
ings were 33.3% and 42.6% respectively. This difference is statistically signifi- 
cant (P <.001). 
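
The paper does not say which significance test produced the P values quoted
in this and the preceding comparison. A common choice for comparing two reply
percentages is the pooled two-proportion z-test, sketched below. The function
name is hypothetical, and so are the denominators: the number of letters
presumably delivered in each group is not legible in Table 40, so the counts
used here are assumptions chosen only to reproduce the stated percentages
(27.6%, 33.3%, and 42.6%).

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test for the difference between two proportions.

    x1, x2 are the numbers of replies; n1, n2 the numbers of letters delivered.
    Returns (z, two-sided P value) using the normal approximation.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # P(|Z| > |z|) for a standard normal variable equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# ASSUMED counts (not from the paper): 252/913 = 27.6%, 310/930 = 33.3%,
# 396/930 = 42.6%. Only the 252 replies appear in the legible part of Table 40.
z_letter, p_letter = two_proportion_z_test(252, 913, 310, 930)  # letter wording
z_stamp, p_stamp = two_proportion_z_test(310, 930, 396, 930)    # stamp vs. none
print(f"wording: z = {z_letter:.2f}, P = {p_letter:.4f}")
print(f"stamp:   z = {z_stamp:.2f}, P = {p_stamp:.6f}")
```

With denominators of this size the first comparison gives a P value in the
neighborhood of the reported .008 and the second falls well below .001, which
suggests, but does not establish, that a test of this kind was used.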

The second half of the second A.C.S. test and the second Yale test were alike 
to the extent that postage stamps were placed on the reply envelopes. They
differed as to sponsorship (i.e., A.C.S. and Yale), as to title of person signing the
letter (the same person in both instances) and as to the wording of the covering 
letters. However, the A.C.S. letter made no mention of cancer (except in the 
letterhead) and an attempt was made to avoid any emotional appeal. (See 
Appendix D.) The percentage replies from the first mailings were about the 
same for the two samples, being 42.6% for the A.C.S. sample and 41.2% for the 
Yale sample. 

The replies from the second mailing of the second Yale test amounted to 
52.3%. The two halves of the second A.C.S. test were merged for the second 
mailing, so separate figures are not available. The percentage replies from the 
two halves together in the second mailing was only 39.4%. 


Answers to Questions on Smoking Habits: 


The percentage of replies is certainly a matter of concern to an investigator 
if for no other reason than that it affects the cost of a study. However, a low 
percentage return does not necessarily indicate that there is bias in answers to 
various questions. 

Altogether, questionnaires were mailed to 5,992 people. The returns were:
3,705 (61.8%) answered questionnaires; 461 (7.7%) returned by post office 
undelivered; and 1,826 (30.5%) no return. Thus, of the 5,531 people who pre- 
sumably received our letters, 67.0% replied. Of the 3,705 answered question- 
naires, 60 were answered by women, 79 stated that the addressee was dead, 
and 6 were too poorly filled in to be usable. This left 3,560 for analysis. It should 
be noted that occasionally a question or two was left unanswered and in such 
instances these questionnaires were omitted from certain tabulations in order 
to avoid undue complexity. For this reason, there are slight differences in the 
totals shown on different tables that follow. 
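
The accounting in the paragraph above can be checked with a few lines of
arithmetic. The variable and function names below are illustrative; all of
the counts are those quoted in the text.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


mailed = 5992
answered = 3705
undelivered = 461   # returned by the post office
no_return = 1826

# The three outcomes should account for every questionnaire mailed.
assert answered + undelivered + no_return == mailed

delivered = mailed - undelivered          # letters presumably received
reply_rate = pct(answered, delivered)     # replies among those delivered

# Exclusions before analysis: answered by women, addressee dead, unusable.
usable = answered - 60 - 79 - 6

print(pct(answered, mailed), pct(undelivered, mailed), pct(no_return, mailed))
print(delivered, reply_rate, usable)
```

Expressing the reply rate on the delivered letters rather than on all letters
mailed follows the reasoning given earlier, namely that an unsatisfactory
mailing address is unlikely to be related to smoking habits.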

Table 42 shows the distribution of men by current smoking habits for sev- 
eral different groupings. There was not much difference between the Yale 
samples and the American Cancer Society samples and not much difference 
between those who replied to the first letter and those who failed to reply to 
the first letter but replied to the second letter. There was only a relatively small 
difference between replies received in the latter part of 1957 compared with the 
early part of 1958. 
The distribution of answers to the questions on inhalation also showed little
difference between the Yale and American Cancer Society samples; between 
replies to the first and second letters; and between the 1957 sample and the 
1958 sample. What is more important, the relationship between inhalation and 
type of smoking was about as close in these several samples as could have been 
expected considering the number of cases in each group. This being so, all of 
the questionnaires received were grouped together for further analysis. 


TABLE 42


DISTRIBUTION OF MEN BY CURRENT TYPE OF SMOKING; COMPARISON
OF DIFFERENT SAMPLES AND COMPARISON OF FIRST MAILING
RETURNS WITH SECOND MAILING RETURNS


[The numeric entries of this table are not legible in the source. Columns:
1st and 2nd Mailings Combined (A.C.S. Sample; Yale Sample) and A.C.S. and
Yale Samples Combined (1st Mailing; 2nd Mailing), each giving No. and %.
Rows, shown separately for the survey of Nov.-Dec. 1957 and the survey of
Jan.-Feb. 1958: Never Smoked Regularly; Ex-Regular Smokers; Cigarette Only;
Cigarette and Other; Pipe Only; Cigar Only; Pipe and Cigar; Total.]



Haenszel, Shimkin, and Miller [5] have reported on the smoking pattern in 
a large random sample of the United States population. The survey was con- 
ducted by the Bureau of the Census as a supplement to its February 1955 Cur- 
rent Population Survey. Of men in age group 25-34, 19.0% in the Census Bu- 
reau survey and 18.9% in this study said that they had never smoked. The cor- 
responding figures were 18.7% and 17.2% respectively in age group 35-44; 
17.1% and 17.4% respectively in age group 45-54; and 18.1% and 21.1% 
respectively in age group 55-64. The greatest difference, which amounted to 
only 3.0%, was in age group 55-64 and this was not statistically significant 
(P =.11). The correspondence is remarkably close considering the difference in 
years, the difference in sampling procedures, and the differences in the question- 
naires. 
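
The comparison with the Census Bureau survey can be laid out as a small
table of differences. The percentages are those quoted in the paragraph
above; the variable names are illustrative.

```python
# Per cent of men who said they had never smoked, by age group:
# (Census Bureau survey of February 1955, present study).
never_smoked_pct = {
    "25-34": (19.0, 18.9),
    "35-44": (18.7, 17.2),
    "45-54": (17.1, 17.4),
    "55-64": (18.1, 21.1),
}

# Difference in percentage points, present study minus Census Bureau survey.
diffs = {
    age: round(study - census, 1)
    for age, (census, study) in never_smoked_pct.items()
}

# Age group with the largest absolute difference.
largest = max(diffs, key=lambda age: abs(diffs[age]))

for age, d in diffs.items():
    print(f"{age}: {d:+.1f}")
print("largest difference:", largest, diffs[largest])
```

The differences are in percentage points and change sign from one age group
to another, which is what one would expect if they reflect sampling
fluctuation rather than a systematic shift.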
6. DISCUSSION 


The writer is fully aware of the danger that biasing factors of one sort or 
another may influence the results of a study of this type. A test was made by 
deliberately attempting to introduce a difference in answers to questions on 
smoking habits in two matched samples: (1) by a great difference in the word- 
ing of the covering letters, (2) by mailing one set of questionnaires from a volun- 
tary health agency and the other set from the statistical laboratory of a uni- 
versity, and (3) by attaching postage stamps to the return envelopes for one 
set but not the other set. There was a considerable difference in the per cent of 
men who replied. However, there was only a slight difference in the percentage 
distribution of answers to questions on smoking habits. This does not prove 
that the answers were entirely unbiased; conceivably, both samples were biased 
in exactly the same way and to the same extent. Nevertheless, it gives some 
measure of confidence in the findings. 

The fact that 33% of those who presumably received our letters did not reply 
leaves the possibility that the responders were not a truly representative 
sample of the entire group in respect to smoking habits. However, two facts 
suggest that this did not have a great influence on the findings: (1) the smoking 
habits as stated by those who replied to the first letter mailed to them were 
about the same as the smoking habits as stated by those who failed to reply to 
the first letter but did reply to a second letter, and (2) the proportion of non- 
smokers in the group that replied was in close agreement with the proportion 
of non-smokers previously reported for a random sample of the United States
population of the same age and sex in a survey made by the Bureau of the
Census.


In summary, it appears that factors such as the wording of the covering letter 
had an effect on the percentage returns but made little or no difference in the 
percentage distribution of answers to questions on smoking habits. 


PART II: 7. FINDINGS ON INHALATION 


Table 44 shows the number and percentage distribution of answers to the 
questions on inhalation by current type of smoking for three age groups. When 
a person who smoked two or three different types (e.g., both cigarettes and 
pipes) indicated that he inhaled more when smoking one type than when smok- 
ing another, he was classified in this table according to the greatest amount of 
inhalation recorded. 
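
The classification rule for men who smoke more than one type can be written
as a maximum over an ordered scale. The sketch below uses the four answers
offered on the questionnaire; the function and variable names are
hypothetical.

```python
# The four answers on the questionnaire, from least to greatest inhalation.
INHALATION_ORDER = [
    "do not inhale",
    "inhale slightly",
    "inhale moderately",
    "inhale deeply",
]


def overall_inhalation(answers):
    """Greatest amount of inhalation recorded across the types smoked.

    `answers` maps each type smoked (e.g. "cigarettes", "pipe") to the
    answer checked for that type.
    """
    if not answers:
        raise ValueError("at least one answer is required")
    return max(answers.values(), key=INHALATION_ORDER.index)


# A man who inhales cigarette smoke moderately but does not inhale pipe
# smoke is classified as inhaling moderately.
example = {"cigarettes": "inhale moderately", "pipe": "do not inhale"}
print(overall_inhalation(example))
```

Taking the greatest amount means that a mixed smoker is never classified
below his heaviest reported inhalation, so any tendency of this rule is
toward overstating, not understating, inhalation among mixed smokers.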

It is apparent that cigarette smokers, as a group, inhale far more than either 
pipe or cigar smokers. Among men who smoke just one type, the percentage 
who said that they do not inhale was 4.0 for cigarette smokers, 54.2 for pipe 
smokers, and 72.3 for cigar smokers. The percentage who said that they inhale 
deeply was 30.1 for cigarette smokers, 3.8 for pipe smokers, and 1.3 for cigar 
smokers. The older men reported less inhaling than the younger men; but in all 
three age groups (i.e., under 40, 40-59, 60 and older) cigarette smokers re- 
ported far more inhaling than pipe or cigar smokers. 

Men who smoked cigarettes and cigars (but not pipes) reported far less
inhaling than men who smoked cigarettes only. Of those in the cigarette-only
category, 13.4% said that they inhale none or slightly and 86.6% said that they
inhale moderately or deeply. Of those in the cigarette-cigar category, 50.6%
said that they inhale none or slightly and 49.4% said that they inhale
moderately or deeply. As shown by figures given in Table 45, men in the
cigarette-cigar category inhale even less than light smokers (i.e., under 1
pack of cigarettes a day) in the cigarette-only category.


TABLE 44


AMOUNT OF INHALATION BY CURRENT TYPE OF SMOKING
AND BY AGE


[The numeric entries of this table are too garbled in the source to be
reconstructed. The table gives, for each age group, the number and per cent
of men by amount of inhalation within each current type of smoking:
Cigarette Only; Cigarette & Pipe; Cigarette, Pipe & Cigar; Cigarette & Cigar;
Pipe Only; Pipe & Cigar; Cigar Only.]

There was little difference in amount of inhalation between men in the cig- 
arette-pipe category and men in the cigarette-only category. 

Table 45 shows, for men currently smoking cigarettes only, the number and
percentage distribution of answers to the question on inhalation by average
number of packs of cigarettes smoked per day for three age groups. Two things
are apparent: (1) Within each of the three age groups, the proportion of men
who say that they do not inhale goes down and the proportion of men who say
that they inhale deeply goes up with amount of cigarette smoking. (2) Within
each of the three levels of amount of smoking, the proportion of non-inhalers
goes up and the proportion of deep inhalers goes down with advancing age.
Those reporting the greatest amount of inhalation were men under 40 who
smoke more than a pack of cigarettes a day (less than 1% said that they do
not inhale and about 50% said that they inhale deeply). Those reporting the
least amount of inhalation were men in age group 60 and older who smoke less
than one pack of cigarettes a day (29.4% said that they do not inhale and only
9.8% said that they inhale deeply).


TABLE 45


INHALATION BY AMOUNT OF CIGARETTE SMOKING AND AGE
AMONG MEN WHO SMOKE ONLY CIGARETTES


[The numeric entries of this table are not legible in the source. The table
gives the number and per cent of men by amount of inhalation for each level
of smoking (Under 1, About 1, and Over 1 pack of cigarettes per day, and
Total) within each age group, including a group with age not stated.]

Table 46 shows the number and per cent of non-filter tip and filter tip
smokers among men who smoke cigarettes only by age and by amount of smoking.
Table 47 shows the number and per cent of non-filter tip and filter tip
smokers who said that they inhale either moderately or deeply and those who
said that they do not inhale or inhale slightly. Twenty-eight men who did not
check type of cigarette (or checked both types) are omitted from these tables.


TABLE 46


DISTRIBUTION OF MEN WHO SMOKED CIGARETTES ONLY BY TYPE OF
CIGARETTE, BY AGE, AND BY DAILY AMOUNT OF CIGARETTE SMOKING


[Entries shown as ".." are illegible in the source. Only the whole-number
part of each percentage is legible. The labels of the 40-59 and 60-and-older
blocks are supplied from the age groups named in the text.]

                 Packs of               Non-filter Tip     Filter Tip
Age Group        Cigarettes   Total     No.      %         No.      %
                 per Day

Total            Under 1        ..      153      49        ..       51
                 About 1        ..       ..      58        ..       41
                 Over 1         ..      306      57        ..       42
                 Total          ..      905      56        ..       43

Under 40         Under 1        ..       59      48        ..       51
                 About 1        ..      191      54        ..       45
                 Over 1         ..       ..      59        ..       40
                 Total          ..       ..      55        ..       44

40-59            Under 1        ..       ..      47        ..       52
                 About 1        ..       ..      59        ..       40
                 Over 1         ..       ..      55        ..       44
                 Total          ..       ..      56        ..       43

60 and older     Under 1        49       27      55        22       44
                 About 1        57       33      57        24       42
                 Over 1         37       24      64        13       35
                 Total         143       84      58        59       41

Age Not Stated   Under 1        37       17      45        20       54
                 About 1        80       56      70        24       30
                 Over 1         65       35      53        30       46
                 Total         182      108      59        74       40

The sale of filter tip cigarettes has been increasing rapidly during the last 
several years. In this sample, 43.7% of the cigarette smokers said that they 
smoked filter tip cigarettes. Age seems to make little difference. However, 
men smoking less than one pack of cigarettes a day seem to have a somewhat 
greater preference for filter tips than do heavier smokers. 

The proportion of men who inhale does not appear to differ greatly between 
non-filter tip smokers and filter tip smokers. 


TABLE 47


AMOUNT OF INHALATION OF MEN WHO SMOKED FILTER AND NON-FILTER
TYPE CIGARETTES ONLY BY AGE AND BY DAILY
AMOUNT OF CIGARETTE SMOKING


[The numeric entries of this table are too garbled in the source to be
reconstructed. For non-filter tip smokers and for filter tip smokers
separately, the table gives the total number of men and the number and per
cent who inhale none or slightly and who inhale moderately or deeply, by
packs of cigarettes per day (Under 1, About 1, Over 1, Total) within each age
group, including a group with age not stated.]


8. DISCUSSION


In 1950, Doll and Hill [1] published the results of a retrospective study in
England on smoking in relation to lung cancer. The subjects were classified by
average number of grams of tobacco smoked per day, no distinction being made
between different types of smoking (i.e., cigarette, pipe, and cigar). They
found a high degree of association between lung cancer and number of grams of
tobacco smoked per day. However, disregarding type of smoking, amount of
smoking, and age, the proportion of smokers who said that they inhaled was
reported to be slightly smaller in the lung cancer group than in the control
group. This “anomalous” result was puzzling since, as the authors said: “It
would be natural to suppose that if smoking were harmful it would be more
harmful if the smoke were inhaled.”

Recently, Schwartz and Denoix [8] have published the results of a retro- 
spective study in France in which type of smoking, amount of smoking, and 
inhalation were all taken into account and reported in considerable detail. 
Holding amount of cigarette smoking constant, they found that the “relative
risk” of lung cancer was considerably greater among smokers who said that 
they inhaled than among smokers who said that they did not inhale. Among 
subjects who said that they did not inhale, the “relative risk” of lung cancer 
increased rapidly with amount of cigarette smoking. The proportion of subjects 
who said that they inhaled: (1) increased with amount of cigarette smoking; 
(2) was greater among subjects who smoked cigarettes only than among sub- 
jects who smoked pipes as well as cigarettes; and (3) was many times higher 
among cigarette smokers than among pipe smokers. 

It is not clear why the findings in respect to inhalation were different in the 
two studies discussed above. 

The fact that, holding amount of cigarette smoking constant, men who 
smoked cigars as well as cigarettes had lower death rates than men who 
smoked only cigarettes was the most puzzling finding in a recent prospective 
study of smoking in relation to death rates [6]. One possible explanation was 
that men in the cigarette-only category tend to inhale to a greater degree than 
men in the cigarette-cigar category. The findings of the present study seem to 
support this hypothesis in light of the findings of Schwartz and Denoix on the 
relationship of inhalation to lung cancer. However, additional research will have 
to be carried out before this can be stated with certainty. 

The fact that few cigar and pipe smokers inhale while most cigarette smokers 
inhale may be the reason (or at least one of the reasons) why the death rate of 
cigarette smokers has been found to be much higher than the death rate of 
cigar and pipe smokers. On the other hand, differences in inhalation do not 
appear to account for the difference in death rates between cigar and pipe 
smokers which has been reported. 

It is possible that replies to the questions on inhalation were far from 
accurate. Some measure of confidence is had from the fact that Schwartz and 
Denoix found the same relationship between inhalation and amount of ciga- 
rette smoking as was found in this study. Furthermore, the results of this 
study are in agreement with the results of a study conducted for the Tobacco 
Manufacturers’ Standing Committee of Great Britain edited by G. F. Todd 
[9]. They also found that the proportion of cigarette smokers who say that 
they inhale increases with the number of cigarettes smoked per day and
decreases with age. It is of interest that, holding constant both age and number
of cigarettes smoked per day, they found a smaller proportion of inhalers 
among women than among men. Of course, it is conceivable: (1) that old men 
are less willing than young men to admit to inhaling tobacco smoke even 
though they admit to the same amount of smoking, (2) that men who admit to 
heavy smoking are those who will also admit to deep inhalation, and (3) that 
cigar smokers who inhale deeply will not admit the fact while cigarette smokers 
have no such compunction. If so, all three of these studies could be misleading 
in respect to inhalation. 

On the other hand, there is an explanation for the difference in inhalation 
between cigar smokers and cigarette smokers which makes the findings seem 
entirely reasonable in this respect. (I hasten to add that apparent reasonable- 
ness does not prove that a finding is correct.) Cigar smoke is alkaline while 
cigarette smoke is neutral, and most people find cigar smoke far stronger than
cigarette smoke in both taste and odor. Many people who inhale cigarette 
smoke say that they cannot inhale cigar smoke to the same degree without 
feeling ill. The smoke from pipes is quite variable, depending on the type of 
tobacco, the size of the bowl, etc., but in general, is heavier than cigarette 
smoke. 
PANEL MORTALITY AND PANEL BIAS 


Marion Gross Sobol
Survey Research Center* 


Parts I and II of this paper evaluate “panel mortality” by studying 
the demographic structure and original interest in the subject matter 
of the study, at the time of the first and for each of the four subsequent 
interviews. Because of cancelling variations, the demographic structure 
of the panel after five rounds of interviewing remained very similar to 
that of the original panel. There was some tendency, however, for a
disproportionate number of renters, low income people, and people not
interested in the subject matter of the study to drop out after repeated 
interviews. The third part of this paper evaluates “repeated interview 
effects” by comparing the answers of panel members to the answers of 
members of a new probability sample of the urban, non-institutional 
population of the United States, who were interviewed at the same time 
on the same questions. Once the effects of differing income distribution 
in these groups were eliminated, there was little indication that the 
attitudes of a panel after four rounds of interviewing differed from those 
of a random sample. 


The panel method, consisting of repeated interviews with the same sample
of respondents, is uniquely valuable for the study of questions such as
those concerning attitude formation and change, or purchase and savings de- 
cisions and their fulfillment. However, if it is found that serious bias arises 
because of repeated interviewing or because of the fact that certain types of 
people are more likely than others to drop out of the sample, the uses of the 
panel method may be seriously limited. 

There are at least two major ways in which such bias may arise. First, it may 
be that particular groups of people (for example, pessimistic people) are more 
likely to drop out than others. We shall call bias that arises in this fashion 
“panel mortality.” Second, repeated interviewing may cause changes in
attitudes. These effects will be called “repeated interviewing effects.”¹ Panel bias
may also arise in a third way: people whose attitudes have changed in certain 
ways may drop out of the sample. For example, if people who have recently 
become more pessimistic tend to drop out, the panel will be more optimistic
than a random sample. This effect is really a result of panel mortality; however, 
since it cannot be distinguished from “repeated interviewing effects” it will be 
considered along with these effects. A fourth type of bias, “interviewer effect,” 
may arise from repeated interviews with the same interviewer. Since the number
of families contacted by each interviewer was not large enough to permit
significant tests of this effect, we shall not study it here.

* This study was made possible by a grant from the Ford Foundation to the Survey Research Center for studies
to be carried out under the direction of George Katona on the origin and effects of changes in economic attitudes.
The questions asked in the interviews concerned spending and saving by the respondents, and their economic
information and attitudes.

The author would like to express special thanks to Dr. George Katona, Dr. Eva Mueller and Dr. Leslie Kish.
Dr. Mueller helped in the formulation of the analysis plan and offered guidance throughout the course of the
research. Dr. Katona has made suggestions at each stage and was especially helpful in the final presentation of this
paper. Dr. Kish devised the significance test used in Table 66. Robert Hsieh and Robin Barlow participated in the
analysis.

1 This effect has been referred to as the “Hawthorne effect” by Lazarsfeld in his article “Repeated Interviews
as a Tool for Studying Changes in Opinion and Their Causes,” American Statistical Association Bulletin, 1941, Vol. 2,
3-7.

In 1954-1957, the Survey Research Center carried out a panel study of
economic attitude formation and change; of plans to buy cars and durable 
goods, and of the fulfillment of these plans. This study was specifically designed 
to permit investigation of panel bias. The study consisted of five waves of inter- 
views with the same family units over a three-year period. While Waves II, III, 
and IV were conducted, other random sample studies asking identical questions 
were in the field. Thus we are able to compare the attitudes of the panel and 
those of a random sample interviewed at the same time for three waves of inter- 
viewing. 

Previous studies of “drop-outs” in consumer panels have confined themselves 
to a discussion of the demographic effects of panel mortality. However, some 
other work has been done to characterize the drop-outs. In a political survey, 
Lazarsfeld found that there is a tendency for people who have responded 
“don’t know” to a number of questions on earlier interviews to drop out at 
some later date.² In a pilot study of purchases of consumer durables, Ferber
found that although there was no relation between intended purchases per 
family and ultimate panel status, those families who remained in the panel re- 
ported an appreciably larger number of purchases in the recent past than those 
who dropped out in later waves.³ This finding suggests that people who are
interested in the subject matter of a panel are more likely than those not in- 
terested to remain in the panel. 

In recent years there have been three panel studies of shorter duration but 
of similar content to the panel considered here, for which results of panel mor- 
tality have been published. The Life Study of Consumer Expenditures con- 
ducted by Alfred Politz Research, Inc., interviewed members of each household 
four times in a thirteen-week period.⁴ It found that the income of its panel was
appreciably below that of a random sample. 

A one-year reinterview study conducted in 1948-49 in urban areas by the 
Survey Research Center for the Federal Reserve Board showed few differences 
between a panel and a reinterview sample. There were more primary spending 
units in a panel than in a random sample.⁵

A similar one-year reinterview study conducted in England by the Oxford 
Institute of Statistics found that the number of persons in the income unit was 
the most important factor in determining the response rate of panel members. 
The larger the income unit the more likely it was to remain in the panel. In 
addition, the demographic characteristics of this panel at the time of the
second interview differed little from those of a random sample.⁶ All discussions
here dealt with panel mortality. None of these studies considered “repeated 
interview” or “attitude change” effects. 





2 Ibid., 4-7. 

3 Ferber, Robert F. “Observations on a Consumer Panel Operation,” Journal of Marketing, January, 1953,
246-252.

4 Life Study of Consumer Expenditures, Vol. 1, 1957. Alfred Politz Research, Inc., 11.

5 Lansing, John B. “Concepts Used in Surveys,” Contributions of Survey Methods to Economics by Katona, G.,
Klein, L. R., Lansing, J. B. and Morgan, J. N., Columbia University Press, New York, 1954, 45-48.

6 Vandome, Peter. “Aspects of the Dynamics of Consumer Behavior,” Bulletin of the Oxford Institute of
Statistics, Vol. 20, February, 1958, No. 1, 65-105.
The first part of this paper examines the effects of panel mortality on the
demographic structure of the panel. The second part of this paper examines the 
relationship between degree of interest in the subject matter of the study and 
panel structure. Generally we find that people who are lost for different reasons 
(moving vs. refusing to be interviewed) differ in such an offsetting manner that 
the demographic characteristics of the panel after five waves of interviewing 
differ little from what they were at the beginning of interviewing. Secondly,
people who do not follow economic news and events, an important portion of 
the subject matter of this study, are more likely to drop out than people who do 
follow such matters. The third section of the paper will compare economic atti- 
tudes of panel members in Waves II, III, and IV with those of a random 
sample interviewed at the same time to see if there is any indication of “re- 
peated interview” or “attitude change” effect. 


1. PANEL MORTALITY


This study consisted of five waves of interviews taken with the same families 
over a period of two years and eight months. The first four interviews were 
spaced at six-month intervals and the final interview took place about a year 
after the fourth interview. The panel was based on a probability sample of the 
urban, noninstitutional population of the United States. The sample drawn in 
the central office of the Survey Research Center consisted of 1,358 urban
addresses.⁷ In the first wave complete interviews were received with the
predesignated respondent (alternately husband and wife of the primary family) at
1,153 of these addresses. At the end of the fifth round 707 original respondents
remained in the sample, with whom five consecutive interviews were completed
(Table 55a). Thus sixty-one per cent of the panel remained through nearly three
years and five rounds of interviews on economic attitudes, purchases and
purchase plans, and income and asset changes.
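
The retention figures quoted above can be rechecked from the wave counts reported in Table 55a. The following is a minimal sketch rather than part of the original analysis: the function and variable names are hypothetical, and the "lost" figure is assumed to be simply the difference between successive counts.

```python
# Interviews completed in each wave, as reported in Table 55a.
ADDRESSES_SENT_OUT = 1358
INTERVIEWS_TAKEN = {
    "Wave I, June 1954": 1153,
    "Wave II, December 1954": 958,
    "Wave III, June 1955": 856,
    "Wave IV, December 1955": 819,
    "Wave V, March 1957": 707,
}


def retention_table(taken, addresses):
    """Return one row per wave:
    (wave, interviews taken, lost since previous stage,
     per cent of addresses, per cent of Wave I interviews).

    "Lost" is assumed here to be the drop from the previous stage.
    """
    rows = []
    previous = addresses
    wave_one = next(iter(taken.values()))  # first entry is Wave I
    for wave, n in taken.items():
        rows.append((wave, n, previous - n,
                     100.0 * n / addresses, 100.0 * n / wave_one))
        previous = n
    return rows


if __name__ == "__main__":
    for wave, n, lost, pct_addr, pct_w1 in retention_table(
            INTERVIEWS_TAKEN, ADDRESSES_SENT_OUT):
        print(f"{wave:<24} {n:>6} {lost:>5} {pct_addr:6.1f} {pct_w1:6.1f}")
```

The last row reproduces the sixty-one per cent retention rate stated in the text (707 of 1,153 Wave I respondents).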

It is seen in Table 55b that the number of movers declined after the first 
wave of interviewing. The rise in the number of movers between Waves IV 
and V probably occurred because over a year elapsed between these inter- 
views. Secondly it is noteworthy that after each interview the field staff be- 
came more successful at following movers. Possibly this indicates that those 
people who remain in a panel are more attached and more likely to offer infor- 
mation about their future whereabouts. Refusal, the most important reason 
for dropping out on the first and second wave of interviewing, declined in sub- 
sequent waves (Table 55a). Thus, in the final waves of interviewing, almost 
an equal number of respondents were lost who moved and could not be fol- 
lowed, who refused to be interviewed, and who were unavailable for interviews. 


Demographic Aspects of Panel Mortality 


This section will compare panel members (people who were interviewed five 
times) and panel losses on the basis of demographic information obtained in the 
first wave of interviewing. These figures are tested for significant differences at
the ninety-five per cent probability level on the basis of sampling errors of
differences (expressed as percentages) which have been calculated for similar
studies. (See Table 56.)

7 The addresses were distributed over 41 primary sampling points, mostly counties.


TABLE 55a
PANEL MORTALITY

                              Number of Interviews     Interviews Taken in % of
Number and Date of Wave       Taken       Lost         Addresses    Interviews in Wave I

Addresses sent out            (1,358)                  100%
Wave I, June 1954             1,153
Wave II, December 1954          958
Wave III, June 1955             856
Wave IV, December 1955          819
Wave V, March 1957              707

[The remaining cells of this table are illegible in the source.]


TABLE 55b
DIFFERENT KINDS OF PANEL LOSSES IN EACH WAVE

            Movers Not Followed      Refused                  Unavailable¹
Wave        Number    Per cent*      Number    Per cent*      Number    Per cent*

[Only one column of figures survives, and its heading cannot be determined:
110, 30, 21, 12, 36; total 209.]


TABLE 55c
NUMBER OF MOVERS

Moved according to:      Followed²      Not followed

Wave II
Wave III
Wave IV
Wave V

Total

[The cell values of this table are illegible in the source.]

* In per cent of addresses sent out.

† In 238 cases persons who moved were followed; however the final panel contains only 153 people with whom
interviews were completed at two or more addresses. This discrepancy can be accounted for by the fact that some
people moved more than once and some people were followed and reinterviewed once but refused a subsequent
interview. One hundred sixty-two respondents were lost because they moved and could not be followed.

1 Unavailable (not at home after repeated calls, designated respondent not available, death, sickness, language
difficulties in any of the five waves).

2 Movers followed (people followed and interviewed at a new address in Waves II, III, IV, or V).


TABLE 56
APPROXIMATE SAMPLING ERRORS OF DIFFERENCES¹
(Expressed in Percentages)

Size of Sample           Size of Sample or Subgroup
or Subgroup         1,000       700         500         300

For percentages from about 35% to 65%
1,000               4.5-6.2     4.9-6.8

For percentages around 20% and 80%
1,000               3.6-5.0

For percentages around 10% and 90%
1,000               2.7-3.7     3.0-4.1     3.3-4.4
700                             3.2-4.4     3.5-4.7
500                                         3.8-5.1

For percentages around 5% and 95%
1,000               1.9-2.7

[The remaining cells, including the rows for subgroups of 300 and 200, are illegible in the source.]

1 The values shown are the differences required for significance (95 per cent probability) in comparisons of
percentages derived from two surveys or from two different subgroups of the same survey. Two values, low and high,
are given for each cell. The lower values are based on the standard error formula for simple random samples. The
higher values are based on extensive computations of individual sampling errors carried out on actual survey data
and allow for the departures, such as stratification and clustering, from simple random sampling in the survey design.
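
The lower value in each cell of Table 56 is said to rest on the standard error formula for simple random samples. A minimal sketch of that calculation follows, under two assumptions that the source does not state explicitly: that the same percentage p is used for both groups, and that "95 per cent probability" is taken as two standard errors. The function name is hypothetical. The higher, design-adjusted values cannot be reproduced this way, since they come from computations on actual survey data.

```python
import math


def min_significant_difference(p, n1, n2, multiplier=2.0):
    """Smallest difference between two percentages, in percentage points,
    that reaches significance under simple random sampling.

    p          -- proportion assumed common to both groups (0.5 for 50%)
    n1, n2     -- sizes of the two samples or subgroups
    multiplier -- number of standard errors required (assumed to be 2)
    """
    standard_error = math.sqrt(p * (1.0 - p) * (1.0 / n1 + 1.0 / n2))
    return 100.0 * multiplier * standard_error


if __name__ == "__main__":
    for p, n1, n2 in [(0.5, 1000, 1000), (0.5, 1000, 700),
                      (0.2, 1000, 1000), (0.1, 700, 500)]:
        print(p, n1, n2, round(min_significant_difference(p, n1, n2), 1))
```

Under these assumptions the sketch gives 4.5, 4.9, 3.6 and 3.5 for the four cases shown, which agree with the legible lower values in Table 56.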

A. Degree of Urbanization—Table 57a reveals that panel losses are slightly 
more frequent in large metropolitan areas (seven of the largest cities of the 
country, including suburbs) than in other cities; however, these differences are
not statistically significant. These differences are small because losses for
different reasons seem to have differing, offsetting characteristics. Thus persons
who refuse to be interviewed are less likely to come from large cities than are 
persons who move and cannot be followed. 







B. Age of Family Head—Although the mean and median of age of family 
head remain approximately the same among panel losses and panel members, 
there is a tendency for persons under twenty-five and those sixty-five and over 
to drop out more frequently than people in other age groups. Thus only 
seventy-six per cent of panel losses, but eighty-four per cent of panel members, 


TABLE 57a
PLACE OF RESIDENCE OF PANEL MEMBERS AND PANEL LOSSES

                                          Panel Members        Panel Losses¹
                                                  Movers               Movers not              Unavail-
Place of Residence                All     All     followed³    All     followed     Refused    able²

Large metropolitan areas and
  their suburbs                    39%     37%     34%         41%     45%          37%
Other cities                       50      52      54          48      44           49
Towns under 2,500 population       11      11      12          11      11           14

Total                             100%    100%    100%        100%    100%         100%       100%
Number of families               1,153     707     153         446     162          190        104
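TABLE 57b
AGE OF FAMILY HEAD OF PANEL MEMBERS AND PANEL LOSSES

Columns: Panel Members (All; Movers followed); Panel Losses (All; Movers not followed; Refused; Unavailable)

[Only one column of figures survives, and its heading cannot be determined.]

Age of Family Head

Under 25                 11%
[label illegible]        27
[label illegible]        28
[label illegible]        16
[label illegible]         9
65 and over               8
Not ascertained           1

Total                   100% in each column
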
1 Panel losses (mortality in 2nd, 3rd, 4th or 5th wave).
2 Unavailable (not at home after repeated calls, designated respondent not available, death, sickness, language
difficulties in any of the five waves).
3 Movers followed (people followed and interviewed at a new address in Waves II, III, IV, or V).


fall in the age range from twenty-five to sixty-four. As Table 57b indicates, 
persons who refuse to be interviewed are significantly older than persons who 
move and cannot be followed. Persons who are unavailable for interview tend 
to be even older than those who refuse. This phenomenon is understandable in 
light of the fact that people who were not interviewed because they were sick 
or had died during the course of this study are classified in this category.
Further tabulations indicate that panel losses are more frequent among fami- 
lies having no children than among families with children. However, the final 
composition of the panel is not substantially altered by these differences. 

C. Education of Head—Turning to education, we find that the panel contains 
a slightly higher proportion of college educated members and fewer members 
with some high school education than are found among panel losses. However, 
these differences are insignificant because of offsetting variations (Table 58). 


TABLE 58
EDUCATION OF FAMILY HEAD OF PANEL MEMBERS AND PANEL LOSSES

                                          Panel Members        Panel Losses
                                                  Movers               Movers not              Unavail-
Education                         All     All     followed     All     followed     Refused    able

Grade school                       36%             29%         36%     30%          34%
Some high school plus
  nonacademic                      21      18      26          25      26           27
Completed high school plus
  nonacademic                      23      23      22          22      28           20
Some college (completed
  college)                         19      22      22          15      14           16
Not ascertained                     2       2       1           2       2            3

Total                             100%    100%    100%        100%    100%         100%       100%


D. Occupation of Head—There is little difference between the occupational 
composition of panel members and panel losses. However, this relative uni- 
formity is caused by offsetting differences among persons who refuse to be inter- 
viewed, persons who move and cannot be followed, and persons who are un- 
available for interview. Persons who refuse to be interviewed include signifi- 
cantly more self-employed businessmen and managers and fewer unskilled 
workers than does the group of persons who move and cannot be followed. 
Persons who are unavailable for interview include significantly more house- 
wives than are found among persons who move and cannot be followed, and 
more unskilled workers than are found among those who refuse to be inter- 
viewed (Table 59a). 

E. Home Ownership—Studies of plans to purchase durable goods and of the 
fulfillment of these plans are related to home ownership. Therefore it is inter- 
esting to note that panel losses contain a significantly higher per cent of renters 
than panel members. As would be expected, persons who move and cannot be 
followed tend to be renters, while persons who refuse to be interviewed and
those who are unavailable tend to be home owners (Table 59b).

F. Income Distribution—When panel members and panel losses are compared,
there emerges a barely significant tendency for panel losses to have a
lower income than panel members. 34 per cent of panel losses earned $5,000 or 
TABLE 59a
OCCUPATION OF FAMILY HEAD OF PANEL MEMBERS AND PANEL LOSSES

Columns: Panel Members (All; Movers followed); Panel Losses (All; Movers not followed; Refused; Unavailable)

Occupation

Professional, semi-professional
Self-employed, businessmen, managers
Clerical and sales
Skilled workers
Unskilled workers
Retired
Unemployed
Housewives
Others (farm operators, students)
Not ascertained

Total                   100% in each column

[Nearly all cell values are illegible in the source. Surviving fragments, in source order:
10%, 1%, 11% in the first row, and one partial column reading 8, 13, 38, 17.]



more in June 1954, while 43 per cent of panel members earned $5,000 or more 
(Table 60). 

G. Sex of Head of Family Unit—In a recent panel study conducted at the 
Oxford Institute of Statistics* it was found that possibilities of reinterview were 
smaller if the head of the income unit were a woman rather than a man. In this 
study, though there was a somewhat smaller chance of obtaining a second inter- 
view if the head was a woman, there was no significant difference in response 
rate between male and female heads. 

H. Number of Persons in Income Unit—Similarly, the Oxford study found
that the more people there were in the family, the more likely they were to
succeed in obtaining a second interview. Our figures show no differences in
response rates among families of different sizes.


TABLE 59b
HOME OWNERSHIP OF PANEL MEMBERS AND PANEL LOSSES

                        Panel Members        Panel Losses
Home                            Movers               Movers not              Unavail-
Ownership       All     All     followed     All     followed     Refused    able

Own              59%     65%     27%                 30%          60%        63%
Rent             39      33      72                  67           38         37
Other             2       2       1                   3            2

Total           100%    100%    100%                100%         100%       100%


* Vandome, op. cit., 69-70.


TABLE 60
1954 FAMILY INCOME OF PANEL MEMBERS AND PANEL LOSSES

                                Panel Members
Income               All        All

Under $2,000          12%        10%
$2,000-4,999          43         44
$5,000-7,499          25         27
$7,500 and over       15         16
Not ascertained        5          3

Total                100%       100%

[The remaining columns (Panel Members: Movers followed; Panel Losses: All, Movers not followed, Refused,
Unavailable) are largely illegible in the source. One further column, whose heading cannot be determined, reads
14%, 47, 17, 13, 9; an isolated 11% also survives in the "Under $2,000" row.]


1955 FAMILY INCOME FROM THE PANEL STUDY
AND FROM A NEW SAMPLE
(Percentage distribution of urban respondents)

Columns: Urban Respondents, New Sample; Wave IV of Panel Study

[Only one column of percentages survives, and its heading cannot be determined.]

Under $3,000                      26%
$3,000-4,999                      33
$5,000-7,499                      23
$7,500 and over                   16
Not ascertained amount earned      2

Total                            100%      100%
Number of cases                 1,544       844


I. The Effect of Moving on Response Rate—Recent census data indicates that 
approximately 17 per cent of the population move in the course of one year. 
Since this study lasted for two years and eight months, one would expect a 
large amount of attrition because of moving. Anticipating the effects of resi- 
dential mobility, attempts were made to follow the panel members who 
changed addresses. This was done first through letters, to be forwarded by the 
Post Office, to which answers were requested; second, by inquiring about
moving plans in each interview; and third, if the family was found to have moved,
through personal inquiries of the neighbors by the interviewers. (This last 
method proved most useful.) When new addresses were obtained, persons who 
moved were followed provided they remained in the same community, includ- 
ing its suburban areas, or had moved to one of the primary sampling points in 
which this survey was conducted. 

At the end of two years and eight months four hundred moves had been 
recorded among panel members. Approximately 33 per cent of the panel had 
moved. However, this figure should not be used as an estimate of mobility 
among the population in general since many families moved two or three times
within the course of interviewing and many people who left the sample for other 
reasons such as refusal may have moved at some later date. 

Of the final panel members, 22 per cent were people who had been followed 
to a new location. If these people had not been followed, the panel after five 
interviews would have contained only 48 per cent of the original sample. 
Furthermore, the panel would have contained a higher proportion of older 
people, of people who owned their homes, and of people with incomes over 
$5,000 (see Tables 57a-59b). Thus, following movers is a very important factor 
in maintaining an unbiased distribution among members of a panel. 
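
The two percentages in this paragraph follow directly from counts given elsewhere in the paper (1,153 Wave I respondents, 707 final panel members, and 153 panel members interviewed at two or more addresses). The short sketch below, with hypothetical function names, simply retraces that arithmetic.

```python
ORIGINAL_SAMPLE = 1153          # complete interviews in Wave I
FINAL_PANEL = 707               # respondents interviewed in all five waves
MOVERS_FOLLOWED_IN_PANEL = 153  # final panel members followed to a new address


def share_of_panel_followed():
    """Per cent of final panel members who had been followed after moving."""
    return 100.0 * MOVERS_FOLLOWED_IN_PANEL / FINAL_PANEL


def retention_without_following():
    """Per cent of the original sample that would have remained
    had no movers been followed."""
    return 100.0 * (FINAL_PANEL - MOVERS_FOLLOWED_IN_PANEL) / ORIGINAL_SAMPLE


if __name__ == "__main__":
    print(round(share_of_panel_followed()))       # share of panel followed
    print(round(retention_without_following()))   # retention without following
```

Rounded to whole per cents, the results are 22 and 48, as stated in the text.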

* * * 


In all these comparisons we have compared panel losses with panel members 
on answers to questions asked in the first rounds of interviewing. There is 
another type of comparison which is of interest in determining representative- 
ness of this sample after several rounds of interviewing. This is a comparison of 
the demographic characteristics of all people who remained in the panel 
throughout the entire five rounds of interviewing (panel members) and all 
people who were questioned at the time of the first interview (original sample). 
This original sample is made up of panel members and panel losses. When these 
two groups (panel members and the original sample) are compared, there are 
no significant differences in place of residence, age, education, occupation,
income, or home ownership (Tables 55a-67a). It should be remembered, however,
that this lack of difference is attributable to cancelling variations.

A comparison of panel members and panel losses on income figures reveals 
that panel losses contain a significantly larger percentage of people who gave 
vague and inadequate information about their incomes in the first round of in- 
terviewing than those who stayed on as panel members. This finding suggests 
that refusal is a gradual process; many people who refuse to answer a few 
questions (such as income questions) in the first and second interviews may 
refuse to be interviewed in later rounds. Second, when we compare the 1955 
income of panel members and a random sample interviewed at the same time 
period (lower part of Table 60), we find that panel members in the fourth 
wave have a significantly higher income. This difference is greater than the dif- 
ference in income on the first round between panel members and the original 
sample. This might suggest that people who experience income gains may be 
more likely to remain in a financial study than those whose incomes have fallen. 


2. PANEL MORTALITY AND INTEREST IN THE SUBJECT MATTER OF THE SURVEY 


Other analyses of panel drop-outs have suggested the possibility that people 
who are not interested in the subject matter of the survey are most likely to 
drop out of a panel. Two areas of interest especially important in this study 
were consumer durable goods purchases and knowledge and attitudes about 
current economic happenings. 


A. Purchase Expectations 


1) House purchase expectations on Wave I of panel members and panel
losses do not differ appreciably. Those who moved and could not be
followed expected to buy a house more often than the sample as a whole,
but these people were balanced by those movers who were followed.

2) Car purchase expectations on Wave I of those who remained in the sample 
for four interviews were slightly higher than for those who dropped out 
during the course of the succeeding three interviews. These differences 
were extremely slight and are therefore inconclusive. 


B. “Keeping Up” with Current Events 


The questions asked in these surveys dealt with economic developments in 
the country, as well as with the respondents’ personal financial experiences. 
Regarding the first aspect, news of business developments, a measure of inter- 
est was obtained in the first wave. The question asked in the first wave read as 
follows: “Now about news of how business is going in the country, do you 
people follow such news regularly, occasionally, or hardly at all?” Those who 
said that they do follow news, either regularly or occasionally, were also asked, 
“What kind of things have you heard recently?” With the help of this question, 
it was possible to identify those people who answered the first question in the 
affirmative but later were unable to recall any business news. As indicated in 
Table 63a, two extreme groups emerge from this inquiry: 23 per cent of the 
representative sample interviewed in Wave I said that they regularly follow 
economic news and also referred to specific recent news; on the other hand, 28 
per cent said that they hardly follow economic news. There are some differences 
in these groups among panel members and panel losses. The proportion of those 
who do not follow business news is 33 per cent among panel losses and 26 per 
cent among panel members. The proportion of these “uninformed” people is 
highest among persons who are unavailable for interview and persons who move 
and cannot be followed; it is lower among persons who refuse to be interviewed. 

The higher the income the greater the proportion of those who follow busi- 
ness news. Therefore, it is necessary to study whether or not the differences ob- 
served in Table 63a persist within the lower income group and within the upper 
income group, each taken separately. The findings are presented in Table 63b. 
They indicate that there are still some differences between panel members and 
panel losses, but they are much less pronounced than those in Table 63a. 

These findings do not necessarily indicate that income differences represent 
the essential factor. It is possible that among low income people lack of interest 
in business news, as well as in other topics of the interview, makes for unavaila- 
bility in later interviews. The income differences between panel members and 
panel losses might then reflect the differential rate in panel mortality among 
those who are interested and those who are not interested in remaining in the 
panel. Such an explanation, though far from proved, would be in accordance 
with the hypothesis proposed previously about “don’t knows” and uninformed 
people. 


3. REPEATED INTERVIEW AND ATTITUDE CHANGE EFFECTS 


As we have seen in Parts 1 and 2, in all respects except home ownership, in- 
terest in the subject matter of the survey, and slight income differences, the 
panel after five rounds of interviewing remained similar to the original sample. 
In this section we will first determine whether there are any differences in the 
TABLE 63a
FOLLOWING BUSINESS NEWS BY PANEL MEMBERS AND PANEL LOSSES

                                              Panel Losses²
                              Panel                   Movers not              Unavail-
Follow News                   Members¹        All     followed     Refused    able

Regularly—mention
  economic news                                20%     22%          20%        18%
Occasionally—mention
  economic news                15              15      14           17         13
Follow—but do not mention
  economic news                32              31      26           33         31
Hardly at all                  26              33      37           27         38
Uncertain, not ascertained      2               1       1            3          -

Total                         100%            100%    100%         100%       100%


TABLE 63b
FOLLOWING BUSINESS NEWS BY UPPER AND LOWER
INCOME PANEL MEMBERS AND PANEL LOSSES

Columns: Panel Members¹; Panel Losses² (All; Movers not followed; Refused; Unavailable)

[Only one column of percentages survives, and its heading cannot be determined.]

Income Under $5,000

Regularly—mention economic news               16%
Occasionally—mention economic news            14
Follow—but do not mention economic news       31
Hardly at all                                 38
Uncertain, not ascertained                     1

Total                                        100%      100%
Number of cases                               433       205

Income Over $5,000

[All cell values for this panel are illegible in the source.]

* Less than half of one per cent.
1 Interviewed four times (first 4 waves).
2 Dropped out in Waves II, III, and IV.
economic attitudes in the first round of interviewing between those who stayed
in the panel for five rounds of interviewing (panel members) and those who 
dropped out (panel losses). Secondly, we shall compare answers on attitudinal 
questions of our panel and of random samples who were asked the same questions
during the same time period. As noted earlier, while Waves II, III, and IV were
in the field, another random sample study asking identical questions was also in
the field. These comparisons of the answers of
a panel and the random sample should give some indication of whether there 
is any effect of repeated interviewing on answers to attitudinal questions or 
whether people whose attitudes have changed in some particular way are more 
likely than others to drop out of a panel. (For the questions studied here see 
questions following Table 66.) 

In Table 66, panel members and panel losses are compared regarding their 
answers to four economic attitude questions asked on the first round of inter- 
viewing. Although panel members seem slightly more optimistic than panel 
losses, there are, with one exception (the “don’t knows” in the questions on 
economic conditions next year), no significant differences in the original answers 
of people who remained for five rounds of interviewing and the answers of 
people who dropped out during the course of the study. 

Since the original attitudes of panel members and panel losses show little 
difference, the next question is, were there any unusual changes in the atti- 
tudes of panel members over time. These changes might be attributable to 
repeated interviewing, and hence sensitization about certain issues. Or, possi- 
bly, these differences could occur because people with certain types of attitude 
changes are more likely than others to drop out of a panel. 

We shall test for these effects by comparing the panel on Waves II, III, and 
IV with the answers of a random sample which was interviewed at the same 
time and asked the same questions. Tables 67a and 67b present the results of 
these comparisons. The figure in brackets represents σ (see footnote 9).
Differences in indices were considered significant if they were equal to or
greater than 2σ. Since the panel was a random sample on the first round of
interviewing (June 1954), the same figures are used for the panel and for a 
random sample in June 1954. On subsequent dates the panel and a new random 
sample interviewed at approximately the same time period are compared. 
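The 2σ criterion described above can be illustrated with a minimal sketch. It assumes simple random sampling and independent samples, and it uses the textbook multinomial variance of an index of the form 100 + % good − % bad; this is not necessarily the pooled-variance method attributed to Kish in the footnotes. The function names are hypothetical.

```python
import math

def attitude_index(pct_good, pct_bad):
    """Index = 100 + per cent good - per cent bad."""
    return 100 + pct_good - pct_bad

def index_std_error(pct_good, pct_bad, n):
    """Standard error of the index under simple random sampling (assumption).

    Scoring each answer +1 (good), -1 (bad) or 0 (other) gives a per-person
    variance of g + b - (g - b)**2, where g and b are proportions.
    """
    g, b = pct_good / 100, pct_bad / 100
    return 100 * math.sqrt((g + b - (g - b) ** 2) / n)

def compare_indices(panel, sample, multiple=2.0):
    """panel and sample are (pct_good, pct_bad, n) tuples."""
    diff = attitude_index(panel[0], panel[1]) - attitude_index(sample[0], sample[1])
    se_diff = math.sqrt(index_std_error(*panel) ** 2 + index_std_error(*sample) ** 2)
    return diff, se_diff, abs(diff) >= multiple * se_diff
```

A difference is flagged when it is at least twice the standard error of the difference, which mirrors the rule stated in the text; under a clustered sample design the standard errors would generally be larger than this sketch suggests.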

Although there are no differences in the initial attitudes of panel members
and losses (Table 66), when the panel is compared with a random sample on
later rounds of interviewing, the panel is significantly more optimistic on all
four economic attitude questions. The relative optimism⁸ of the panel decreases





8 In order to make the answers to these questions comparable, an index was computed by adding 100 plus per
cent good (in integer terms) minus per cent bad (in integer terms). In effect this is the same as weighting per cent
good with 2, per cent who are the same or undetermined with 1, and per cent bad with 0, adding the total score,
and multiplying by 100. Thus if 80% say good, 15% say bad, and 5% say the same or undetermined, the weight
score is 165. A similar index was used in Katona and Mueller, Consumer Expectations, 1953-1956. Survey Research
Center 1956, p. 93. Leslie Kish, Head of the Survey Research Center, Sampling Section, has devised a method
of testing the significance of differences between these weight scores by pooling the variances of these groupings.
This method is used in this paper.
somewhat between Waves II and III and then increases in Wave IV. The only
exceptions to the greater degree of optimism in the panel as compared to a 
random sample occur in June 1955 (Wave III), on personal economic status 
questions (“Are you better off now than you were a year ago?” “Do you expect 
to be better or worse off next year?”).
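The index used in these comparisons is defined in the accompanying footnote in two equivalent ways. A minimal sketch of both forms follows; the function names are hypothetical, and the check uses the footnote's own example (80% good, 5% same or undetermined, 15% bad).

```python
def index_from_difference(pct_good, pct_bad):
    """Index = 100 + per cent good - per cent bad."""
    return 100 + pct_good - pct_bad

def index_from_weights(pct_good, pct_same, pct_bad):
    """Weights of 2, 1 and 0 applied to the percentages for good, same, and bad."""
    return 2 * pct_good + 1 * pct_same + 0 * pct_bad

# The two forms agree whenever the three percentages sum to 100.
example = (80, 5, 15)
assert index_from_difference(example[0], example[2]) == index_from_weights(*example) == 165
```

The equivalence holds because 2G + S = G + (100 − B) when G + S + B = 100.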

Before this relative optimism of the panel is considered as showing “repeated 
interview” or “attitude change” effects, it should be remembered that the income
distributions of the panel and of a random sample differ. The panel contains
a larger percentage of people in higher income brackets. As we have pointed 
out above, part of this difference may be due to the fact that people whose in- 
comes have changed for the worse may be more likely than others to drop out. 
Since the final sample contains somewhat more higher income people than the 
random sample interviewed during the same time period, and since optimism 
on economic attitude questions is correlated positively with income, comparison 
of panel attitudes with those of a random sample should hold income constant. 

Table 67b shows that for each income group up to $7500, the panel after 
four rounds of interviewing shows no difference in attitude from a random 
sample. For people with incomes of $7500 and over, however, the panel is more 
optimistic on all four questions. Since this income category is open-ended at 
the top, it may be that the panel consists of more very high income people than 
does a random sample. From available data it is not possible to check this hy- 
pothesis. In general, however, there seems to be little evidence that attitudes 
of the panel, after four rounds of interviewing, differ from those of a random 
sample. 


4. CONCLUSIONS 


When the effects of panel mortality are considered, the panel seems to vary 
little from the original sample with the exception of the finding that people 
who are uninterested or uninformed about the subject matter of the study are 
more likely to drop out than others. There is some bias toward retaining more 
home owners (because of loss of movers) and more high income people after 
repeated interviews. One of the chief reasons for finding few changes in the 
demographic structure of the panel is the cancelling variation caused by the 
differing characteristics of persons who move and cannot be followed, persons 
who refuse to be interviewed, and persons who are unavailable for interview. 
If a persistent scheme of following movers had not been used, the panel after 
five rounds of interviewing would have shown more bias. 

As far as effects due to repeated interviewing or change in attitude are con- 
cerned, once the effects of income are eliminated there seems to be little indi- 





Nₚ = number of cases in panel sample

Nᵣ = number of cases in random sample

Gₚ = per cent of cases in panel sample having optimistic opinions

Gᵣ = per cent of cases in random sample having optimistic opinions

Sₚ = per cent of cases in panel sample thinking things will remain the same or offering no opinion.
Sᵣ = per cent of cases in random sample thinking things will remain the same or offering no opinion.
TABLE 66
ECONOMIC ATTITUDES OF PANEL MEMBERS AND PANEL LOSSES

Column headings: Panel Members¹; Panel Losses² (All; Movers not followed; ...)

Better Off than a Year Ago
    Better                 29%     28%
    Same, pro-con          35      51
    Worse                  32      21
    Uncertain               3      —
    Not ascertained         1      —

Expect to Be Better Off Next Year
    Better off
    Same, pro-con
    Worse
    Don’t know
    Not ascertained

Expected Change in Business Conditions in Next Twelve Months
    Better
    Same
    Worse
    Don’t know
    Not ascertained

Good or Bad Time to Buy Household Goods
    Good
    Pro-con
    Bad
    Don’t know, depends
    Not ascertained

Total                      100%    100%
Number of Cases            339     120     155     64


* Less than half of one per cent. 

1 Interviewed four times (first 4 waves). 

2 Dropped out in Waves II, III and IV. 

The questions were: “We are interested in how people are getting along financially these days. Would you say 
that you and your family are better off or worse off financially than you were a year ago?” “Now looking ahead—do 
you think that a year from now you people will be better off financially, or worse off, or just about the same as 
now?” “How about a year from now, would you expect that in the country as a whole business conditions will be 
better or worse than they are at present, or just about the same?” “Now about things people buy for their house—I 
mean furniture, house furnishings, refrigerator, stove, TV and things like that. Do you think now is a good time or 
a bad time to buy such large household items?” 





PANEL MORTALITY AND PANEL BIAS 


TABLE 67a

COMPARISON OF ATTITUDES EXPRESSED IN PANEL STUDIES AND IN
NEW SAMPLE STUDIES CONDUCTED AT THE SAME TIME

Index = 100% + % good − % bad

Waves: I, June 1954; II, December 1954; III, June 1955; IV, December 1955

Better Off Than a Year Ago
    Panel          109    115          117
    New sample     109    108 (3.3)    116 (3.0)

Expect to Be Better Off Next Year
    Panel          129    131          130
    New sample     129    124 (2.8)    133 (.8)

Business Conditions in Next Twelve Months
    Panel          133    139          177
    New sample     133    148 (?.3)    169 (2.3)

Good Time to Buy Household Goods
    Panel          118    141          154          154
    New sample     118    125 (3.9)    143 (3.1)    139 (3.3)





TABLE 67b

COMPARISON OF ATTITUDES EXPRESSED BY DIFFERENT
INCOME GROUPS IN WAVE IV WITH NEW SAMPLES

Index = 100% + % good − % bad

Income groups: Under $2,000; $2,000–4,999; $5,000–7,499; $7,500 and over

Better Off Than a Year Ago
    Panel (Wave IV)    94          108          139
    New sample         86 (7.3)    112 (?.5)    137 (5.4)

Expect to Be Better Off Next Year
    Panel (Wave IV)    94          108          139
    New sample         108 (7.2)   134 (4.?)    139 (?.8)

Business Conditions in Next Twelve Months
    Panel (Wave IV)    155         172          180
    New sample         135 (8.3)   165 (?)      175 (4.2)

Good Time to Buy Household Goods
    Panel (Wave IV)    137         153          153
    New sample         127 (9.3)   135 (5.2)    148 (?.4)
cation that the attitudes of a panel after four rounds of interviewing differ from
those of a random sample. This indicates either that these effects are absent
or possibly that they cancel one another.

A final conclusion concerns the methodology of panel studies. Even if costs 
are disregarded and extensive efforts to follow movers are made, substantial 
panel mortality is inevitable. If the primary aim of the study is testing changes 
in relationships over time, panel mortality may be of small concern. However,
there is some evidence that in financial and economic attitude surveys
lower-income people are more apt to drop out. It may be better to
oversample at the outset those groups most likely to drop out, rather than to 
bring the panel up to size with random replacements made at some later date. 
Finally, it should be noted that the subject matter of the study may well in- 
fluence the type of drop-outs. Thus, possibly there would be fewer losses among 
low-income people in a study of potato consumption than in a study of caviar 
consumption. 
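The suggestion to oversample at the outset can be illustrated with a small sketch. The group names and retention rates below are purely hypothetical and are not taken from the study; the only idea carried over from the text is that groups expected to lose more members should start larger.

```python
import math

def initial_sample_sizes(target_final, expected_retention):
    """Initial cases needed per group so that the expected number
    remaining after all waves meets the target.

    target_final: dict of group -> desired final number of cases
    expected_retention: dict of group -> assumed fraction retained (0 < r <= 1)
    """
    return {
        group: math.ceil(target_final[group] / expected_retention[group])
        for group in target_final
    }

# Hypothetical illustration: lower-income households assumed to drop out more often.
sizes = initial_sample_sizes(
    {"lower income": 100, "higher income": 300},
    {"lower income": 0.5, "higher income": 0.75},
)
```

With these assumed rates the lower-income group would start at twice its target size, whereas random replacements made later would change the composition of the panel between waves.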





SOME PROBLEMS OF THE HOUSEHOLD INTERVIEW DESIGN 
FOR THE NATIONAL HEALTH SURVEY* 


HAROLD NISSELSON AND THEODORE D. WOOLSEY


Evidence from earlier surveys and from a pretest conducted in Char- 
lotte, North Carolina, was used in making certain decisions about the 
conduct of the interview for the National Health Survey. 

The inconclusiveness of evidence on the use of proxy respondents 
and on between-interviewer variance led to decisions to accept proxy 
respondents under certain conditions, and to continue with plans to
use a staff of about 140 interviewers, but to accumulate further evidence 
on both these matters on a continuing basis. 

Check lists of diseases again proved efficacious in the Charlotte pre- 
test. A recall period of two weeks was adopted for most illness and 
medical and dental care data, but it was decided not to attempt to 
count separate attacks of chronic illness. Certain procedures were 
adopted to improve the codability of disease and injury information 
secured. 


1. INTRODUCTION 


A MAJOR part of the National Health Survey Program, authorized by the
84th Congress in the summer of 1956, is a continuing sample survey of the
nation’s households to collect information on the incidence and prevalence of 
illness and injury, disability, hospitalization, the utilization of medical and 
dental services, and related subjects. This survey is now under way. 

The start of field work on a nationwide basis came almost exactly ten months 
after President Eisenhower signed the National Health Survey Act. During this 
period there were many technical decisions that had to be made regarding the 
design of the survey. These decisions were made jointly by staff of the Census 
Bureau and of the Public Health Service. (The latter agency has the central 
responsibility for the National Health Survey Program under the Act.) 

As part of the preparatory work the questionnaire and survey materials for 
the national survey were pretested on a sample of about 1,000 households in 
Mecklenburg County, North Carolina, including the county’s major city, 
Charlotte. This pretest provided a dress rehearsal of the survey materials and 
an opportunity to train the survey staff for the national operation. Data from 
the pretest were also used to review certain of the technical decisions made in 
the design of the survey and to obtain evidence on some additional points. 

Some of the decisions that had to be made regarding the design of the house- 
hold survey had to do with problems inherent in the measurement of morbidity 
by means of interview surveys; others were concerned with the emphasis to 
be placed upon various types of data that might be obtained in this particular 
survey, for example, the matter of the emphasis to be placed upon obtaining 
data for short periods of time and upon time comparisons in general, and the 
question of what part of the available resources should be put into securing 





* Presented at the “Miscellaneous Session” of the Biometrics Section, American Statistical Association and 
the Biometric Society (ENAR), 117th Annual Meeting of the American Statistical Association, Atlantic City, 
N. J., Sept. 13, 1957. 

Harold Nisselson, Bureau of the Census; Theodore D. Woolsey, Public Health Service. 
estimates for subareas of the United States, such as regions, States, or metropolitan
areas.

There were also issues that had to be resolved concerning sample design, field 
administration, control of quality of field work, and so forth. 

By and large the decisions concerning the content and definitions to be used 
were made on the basis of expert judgments and the interests of potential users 
of the survey data. An attempt was made to settle the technical questions, as 
far as possible, on the basis of evidence from earlier surveys and the Charlotte 
pretest. 

It is the purpose of this paper to describe how some of the major technical 
decisions for the new National Health Survey came to be made, but it must 
be made clear that these decisions are not by any means irrevocable. One of 
the principal advantages of a continuing survey is the opportunity it provides 
for a long-range campaign of methodological research and improvement. This 
research is, in fact, specifically authorized by the National Health Survey
Act [17]. 


2. PRINCIPAL PROBLEMS OF MORBIDITY SURVEY DESIGN RECOGNIZED 
AT THE BEGINNING OF THE NATIONAL HEALTH SURVEY


Among the principal problems peculiar to or especially important in the de- 
sign of morbidity surveys are the following: 

First, the matter of accepting responses about illness of an adult absent at 
the time of the interview from a related adult present at the interview; in 
other words, this is the problem of what rules to set for the interviewer about 
interviewing adults for themselves as against accepting proxy respondents. 

Second, the control of error and bias associated with the interviewer. 

Third, the set of problems related to the use of a series of probes to improve 
the completeness of reporting of illness conditions, and especially the use of 
check lists for this purpose. 

Fourth, the underreporting, or misreporting, by the respondent of events 
occurring in a specified period of time prior to the interview. 

Fifth, the difficulty arising in attempts to define episodes of illness in the 
course of a chronic disease, that is, flare-ups or attacks. The beginnings and 
ends of such episodes seem to give special trouble from the point of view of 
response error. 

Sixth, the search for methods of improving the detail and reducing the re- 
sponse variability in obtaining descriptions of the nature of the disease, injury 
or impairment reported in the interview. 

Finally, perhaps the most widely discussed of the problems of the interview- 
type morbidity survey, the validity of the prevalence measures provided for 
specific chronic diseases. There have been four recent papers on this last subject, 
including one by the present authors which expresses our views on the subject 
at some length [7, 14, 19, 21]. Hence, we shall not discuss this particular matter 
further here, except to say that we do not believe that an interview-type survey 
by itself can possibly be expected to supply estimates of the prevalence of all 
diagnosable cases of specific chronic diseases. The National Health Survey Pro- 
gram recognizes the need for data of this sort and is going to attempt to provide 
them by other means. 
Taking up each of the other six methodological matters in turn, we shall begin
with a brief résumé of evidence from other studies, then report the experience 
of the Charlotte pretest, and finally show how the issue was resolved for the 
national survey. 


3. RESPONDENT SELECTION 


Recent studies have again raised serious questions about accepting responses
about illness of an adult absent at the time of interview from a related adult 
present at the interview [4, 14]. Some experience indicates that a household 
respondent may fail to report as much as 20 to 25 per cent of all conditions— 
including, of course, the most minor—that would be reported by the persons 
themselves. Analysis seemed to show, however, that statistics subject to bias 
with the use of a household respondent were subject to just about as much 
gross error, and therefore response variability, even when all adults were inter- 
viewed for themselves. Thus, even with the presumably best respondent, re- 
sponse variability could significantly affect estimates for particular statistics 
or for individual cells of a tabulation representing small samples. Major effort 
in the planning for the National Health Survey was devoted to seeking more 
objective operational definitions of the concepts to be measured in order to 
reduce such response variability. In the Charlotte pretest an attempt was made 
to measure what had been accomplished by these steps with a view to deciding 
who should be considered an acceptable respondent. This was done by splitting 
the sample—with all adults to be interviewed for themselves in half the house- 
holds and household respondents to be accepted in the other half. 

The Charlotte pretest was based on a sample of 200 segments averaging an 
expected six addresses in size. The sample was for the most part a one-stage 
area sample selected from the entire Charlotte Standard Metropolitan Area, 
but differential sampling rates were used between urban and rural areas, and 
between white and nonwhite, so that the characteristics of the sample in those 
regards would reflect the United States distribution rather than that of the 
Charlotte SMA. The data presented in Tables 72-76 have been weighted for 
subsampling, if any, within the selected segments but not for the differential 
sampling rates by type of area. Apart from 32 segments assigned for interview- 
ing to the supervisory staff for the national survey as a part of their training, 
the sample was divided into four zones, each representing a quarter of Charlotte 
and of the balance of Mecklenburg County, and randomized among interview- 
ers within zone. The interviewer staff consisted of 24 interviewers who were 
randomly assigned among the zones, six interviewers per zone. The entire 
sample was listed prior to the start of interviewing, and questionnaires for 
addresses in the sample were prepared in advance for the interviewers. The 
questionnaires assigned to each interviewer were marked alternately for each
of the two respondent rules: In the households at half the addresses all adults
were to be interviewed for themselves.¹ In the households at the other half, the





1 Substitution of a household respondent was permitted for an adult not competent to report for himself, and 
at the very end of the survey some substitution of household respondents was accepted on the assumption that
this would be preferable to a person-noninterview. (About 3 per cent of the households in each of the two groups were 
entire noninterviews.) Hence the figures for per cent of persons interviewed for themselves shown in Table 73 
are under 100. 
interviewer was to interview for himself each adult found at home and to obtain
the information for absent related adults from the best qualified person among 
those found at home. Unrelated adults were, however, always to be interviewed 
for themselves. Also information about a child under 18 years of age was always 
to be obtained from a parent or other person responsible for his care. The per- 
centages of adults interviewed for themselves in households in which only 
adults found at home were interviewed (Table 74) are typical of experience 
in the California, Kansas City SMA, and Baltimore Health Surveys. The 
average number of calls per household was about 0.5 higher when all adults 
were to be interviewed for themselves (2.4 vs 1.9), a somewhat smaller increase 
than typical in other Census Bureau experience on a national basis. 


TABLE 72

COMPARISON OF RATES OBTAINED WITH THE TWO RESPONDENT
RULES: CHARLOTTE, N. C., PRETEST, NATIONAL
HEALTH SURVEY, FEB. 1957

Rate per Person

                                          Conditions (2-week prevalence)         Days of Disability
                                          Total    Chronic   Major     Minor     Total    Bed
                                                             Non-      Non-
                                                             Chronic   Chronic

(1) Households in which all adults
    were to be interviewed for
    themselves                            1.052    .799      .157      .096      1.016    .601

(2) Households in which adults found
    at home were to be interviewed for
    themselves and absent related
    household members                     .958     .734      .131      .093      1.049    .663

Difference: (1) − (2)                     .094     .065      .026      .006      −.033    −.062

Estimated sampling error of difference    .042     .050      .014      .011      .146     .070


Tables 72, 73, and 74 summarize some comparisons from the two halves 
of the sample. The difference in the total prevalence of illness conditions 
shown in Table 72 was more than twice its sampling error. For the major 
groupings of conditions into chronic, major non-chronic (involving either dis- 
ability or medical care) and minor non-chronic (all other conditions) the dif- 
ferences were not statistically significant but were all in the same direction, 
viz., a larger prevalence found in the households in which all adults were inter- 
viewed for themselves. 
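The comparison of each difference with twice its sampling error can be checked directly from the last two rows of Table 72. The sketch below simply restates that arithmetic; the dictionary keys are shorthand labels chosen here, not the table's own wording.

```python
# Differences (1) - (2) and estimated sampling errors as printed in Table 72.
differences = {
    "total conditions": 0.094,
    "chronic": 0.065,
    "major non-chronic": 0.026,
    "minor non-chronic": 0.006,
    "total disability days": -0.033,
    "bed disability days": -0.062,
}
sampling_errors = {
    "total conditions": 0.042,
    "chronic": 0.050,
    "major non-chronic": 0.014,
    "minor non-chronic": 0.011,
    "total disability days": 0.146,
    "bed disability days": 0.070,
}

def exceeds_twice_error(difference, sampling_error, multiple=2.0):
    """True when the absolute difference is more than `multiple` sampling errors."""
    return abs(difference) > multiple * sampling_error

flags = {
    item: exceeds_twice_error(differences[item], sampling_errors[item])
    for item in differences
}
```

Only the total prevalence of conditions passes the criterion, which agrees with the statement in the text that the three component groupings differ in the same direction without being individually significant.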

Tables 73 and 74 were prepared in order to examine further the accept- 
ability of data obtained from the households in which proxy respondents were 
permitted. The purpose was to determine whether one would draw different 
conclusions from a particular set of data and from the corresponding data ob- 
tained when all adults were interviewed for themselves. 

The sampling and response variability in the data are too high to permit de- 
finitive conclusions as to possible biases in the use of a household respondent. 
There is a suggestion of a persistence in the direction of higher reporting of 
prevalence in the half of the sample in which all adults were to be interviewed 
TABLE 73


AGE-SEX RATES FOR PREVALENCE OF CONDITIONS AND DAYS
OF DISABILITY: CHARLOTTE, N. C., PRETEST, NATIONAL
HEALTH SURVEY, FEBRUARY 1957
A. HOUSEHOLDS IN WHICH ALL ADULTS WERE TO BE INTERVIEWED
FOR THEMSELVES








Rate per Person 





Proportion 
Number of | of persons | Conditions (2-week prevalence) Days of 
Age—Sex | persons in | interviewed Disability 
sample! for Major | Minor 
themselves | Total |Chronic| Non- | Non- 
Chronie|Chronic| Total | Bed 











All ages 1.052 799 .157 -096 | 1.016 -601 
Male - 988 737 -161 -091 .926 -409 
Female 1.109 .855 .154 -100 | 1.096 -582 


Under 18 626 229 199 -097 697 -821 
Male .524 - 246 .201 .076 -550 .344 
Female .527 .213 .197 -117 .642 - 282 


18-34 .98 -990 778 105 .107 .689 -283 
Male -97 .912 734 .066 -113 -474 .128 
Female -99 .054 .814 138 -102 -865 -410 


35-64 97 -498 .256 147 096 -474 


Male 95 .397 121 -173 -103 -428 -503 
Female -99 -585 .371 -126 -088 .514 


65 and over -82 824 | 2.132 147 -O44 625 
Male .78 .194 | 1.970 .194 .030 .507 
Female -86 -449 | 2.290 -101 -058 .739 





























1 Weighted counts adjusted for subsampling within segments. 


for themselves. The differences for males in the age groups 18-34 and 35-64— 
for which, when the household respondent rule was used, there were the small- 
est proportions of adults responding for themselves—are consistent with this 
suggestion. Nevertheless, considering the differences for all groups in the light 
of differences in the per cent of adults responding for themselves, there was not 
clear-cut evidence that differential biases between the various age-sex groups 
would be expected to impair age-sex comparisons.²

As a result of this lack of conclusiveness in the evidence available, the extra 
cost of interviewing all adults for themselves was not considered a good investment
for the national survey. Internal analysis of the data of the Charlotte and
other studies suggested, however, that the number of errors of response might 
be reduced by the use of more restrictive rules for determining which persons 
were to be considered acceptable proxy respondents. It had been observed that 





2 It is of interest that in the California Health Survey [4], larger differences were associated with females not 
reporting for themselves than with males not reporting for themselves. Again, unfortunately, the scale of the in- 
vestigation was not large enough to give definitive results. 
errors on the part of household respondents tended to cluster among those
respondents not closely related to the person being reported for. Consequently, 
the following procedure was adopted: Information for an adult at home at the 
time of the interview is to be obtained from the person himself; for an absent 
adult only from the person’s spouse, if he is married, or from a parent or adult 
son or daughter residing in the household. Information for adults not related 


TABLE 74 


HOUSEHOLDS IN WHICH ADULTS FOUND AT HOME WERE 
TO BE INTERVIEWED FOR THEMSELVES AND ABSENT 
RELATED HOUSEHOLD MEMBERS 








Rate per Person 





Proportion 
Number of | of persons | Conditions (2-week prevalence) Days of 
persons in | interviewed Disability 
sample! for Major | Minor 
themselves | Total |Chronic| Non- | Non- 
Chronic| Chronic 





Total | Bed 





All ages 7 .958*| .784 .131 .093 .049 | .663 
Male .854 .636 .123 .094 -048 | .504 
Female »27 .056 .827 .137 .091 .050 | .618 


Under 18 .507 -247 .139 .120 -434 | .208 
Male .537 . 260 .154 -124 -471 | .222 
Female 445 474 . 234 .124 -117 -396 | .193 


18-34 597 .60 -886 663 . 136 .087 -868 | .457 
Male 256 35 .684 -500 -102 -082 -617 | .254 
Female 341 -79 .038 . 786 -161*; .091 .056 | .610* 


35-64 826 -62 -821*| 1.123 .123 O74 472 | .685 
Male 410 34 -154*| .963 -112 .078 .602 | .634 
Female 416 -87 -486 .281 .135 -070 .343 | .736 


65 and over 137 -66 2.095 | 1.956 096 -O44 .894 |2.650* 
Male 64 .52 1.938 | 1.844 .063 .031 -453 |2.734 
Female 73 .79 2.238 | 2.055 -123 055 -342 |2.575 





























1 Weighted counts adjusted for subsampling within segments. 
* Rate differs from corresponding rate in Table 73 by more than twice the estimated sampling error of the 
difference. 


to the head of the household is to be obtained from the person himself, or from 
a proxy respondent related to him as above, if one is living in the household. 
Information for children under 18 is obtained from a parent or other person 
responsible for their care. 
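The respondent rule adopted for the national survey can be sketched as a small decision function. This is only an illustration: the relation labels and the function name are hypothetical, and returning `None` when no acceptable respondent is present (so that the interviewer would have to call again) is an assumption, since the text does not say what is done in that case.

```python
# Relations accepted as proxies, as described in the text (labels are hypothetical).
ADULT_PROXIES = ("spouse", "parent", "adult son or daughter")
CHILD_PROXIES = ("parent", "other person responsible for care")

def choose_respondent(age, at_home, relatives_at_home):
    """Return who should report for a household member.

    age: age of the person being reported for
    at_home: whether that person is at home at the time of the interview
    relatives_at_home: relations (to that person) of adults present who
        reside in the household
    """
    if age >= 18 and at_home:
        return "self"
    allowed = CHILD_PROXIES if age < 18 else ADULT_PROXIES
    for relation in allowed:
        if relation in relatives_at_home:
            return relation
    return None  # assumption: no acceptable respondent present at this call
```

Because the proxy must be related to the person reported for, the same function also covers adults who are not related to the head of the household.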

This procedure will be reviewed on the basis of data from the re-interview 
check carried out in the National Health Survey for subsamples of households 
and individual persons, the person himself being the respondent in the re-interview
in all cases. The respondent rule is, of course, also subject to review
as the content of the questionnaire is changed. 







4. CONTROL OF ERROR AND BIAS ASSOCIATED WITH THE INTERVIEWER 


The National Health Survey is being carried out by a staff of about 140 
interviewers. Thus, the data for particular geographic sections of the country 
or large standard metropolitan areas for which separate statistics are to be 
provided may represent the work of only a very limited number of interviewers. 
In such a situation, particular attention must be given to the question of vari- 
ability and bias in the survey which may be associated with the interviewer. 

Although the first pioneering studies of interviewer variability were reported
by Mahalanobis in 1946 [15], there have been relatively few reported studies of
interviewer variance based on probability samples³ and even fewer dealing
with morbidity statistics [10, 12]. Since the Charlotte pretest included randomization
of assignments, it was possible to try to obtain some information
on this point. 

Estimates were constructed for the variance between interviewers and the 
variance within interviewers reflecting the sample design and assuming the 
interviewers in the survey to be a random sample [11, 6]. For this purpose the 
segments assigned to the supervisory staff were not included. Also, the assign- 
ments of two interviewers (one in each of two zones) were broken up during the 
course of the survey for assignment to other interviewers for administrative 
reasons and were eliminated from this analysis, and one interviewer assign- 
ment was eliminated at random from each of the remaining two zones to sim- 
plify the analysis. This left 20 interviewer assignments of seven segments each. 
To reduce the work to be done only half of the 120 degrees of freedom available 
for estimating the within-interviewer variances were used. 

Ratios of the estimated between- and within-interviewer mean squares are 
shown in Table 76 for the statistic “rate per person” based on all persons. To 
provide some illustration of the variability, the ratios are shown for each of the 
four zones separately as well as for the sample as a whole. These F-ratios pro- 
vide a test of the existence of between-interviewer variance. Also, the ratio 
(F—1)/F represents the estimated fraction of the variance of the statistic over 
repeated sampling that arises from between-interviewer variance. 
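The F-ratio and the fraction (F−1)/F described above can be illustrated with a simplified one-way analysis of variance. The sketch assumes equal numbers of segments per interviewer and ignores the zones, the weighting, and the other features of the sample design that the authors' estimates reflect; the function names are hypothetical.

```python
from statistics import mean

def between_within_f(assignments):
    """Ratio of between- to within-interviewer mean squares.

    assignments: one list of segment-level rates per interviewer,
    all lists of equal length (a simplifying assumption).
    """
    k = len(assignments)        # number of interviewers
    m = len(assignments[0])     # segments per interviewer
    grand_mean = mean(x for group in assignments for x in group)
    ms_between = m * sum((mean(g) - grand_mean) ** 2 for g in assignments) / (k - 1)
    ms_within = sum((x - mean(g)) ** 2 for g in assignments for x in g) / (k * (m - 1))
    return ms_between / ms_within

def between_interviewer_fraction(f_ratio):
    """Estimated fraction of the variance due to interviewers, (F - 1) / F."""
    return (f_ratio - 1) / f_ratio
```

For an F-ratio below unity the fraction is negative, which corresponds to the remark in the text that such values are attributed to sampling error in the variance components.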

A review of the data in Table 76 shows some F-ratios that would be of 
interest for the problem of survey design, and a few of the F-ratios are large 
enough to be considered statistically significant if the tabulated F-distribution 
is applicable. (Values of the F-ratio less than unity are attributed to sampling 
error in the variance components.) Nevertheless, the data do not provide a 
simple picture of the operation of nonsampling error. The conclusions drawn 
will be indicated below, but first some further analyses which are of interest, 
although highly speculative, will be reported. The 16 F-ratios consisting of the 
four individual zone ratios for all items except total prevalence and bed dis- 
ability were considered as a single set, and compared with the tabulated 
F-distribution by means of the one-sided Kolmogorov-Smirnoff test [3]. (We 





3 For a review as of 1953 see Chapter VII in Hyman, et al. [13]. Some later work is summarized by Gales and 
Kendall [8]. 

4 The test we used requires the assumption that the observed F-ratios represent independent observations from 
a common probability distribution function. These items were excluded because of the correlation of total prevalence 
with other prevalence rates, and of bed disability with total disability. 





TABLE 76. F-RATIOS FOR BETWEEN- AND WITHIN-INTERVIEWER VARIABILITY
FOR THE STATISTIC “RATE PER PERSON”: CHARLOTTE,
N.C., PRETEST, NATIONAL HEALTH SURVEY, FEBRUARY 1957

(Based on 20 interviewers randomized among the 4 zones, with 5 interviewers per
zone; a sample of 140 segments, averaging an expected 6 households in size, 35 per zone,
randomized within zone with 7 segments per interviewer; and the households done by
each interviewer randomized equally between the 2 respondent rules.)

Columns: Respondent Rule (1), all adults to be interviewed for themselves;
Respondent Rule (2), adults found at home to be interviewed for themselves and
absent related household members; Difference in rates: (1)−(2)

Rows (each shown for All zones and for Zones 1, 2, 3, and 4): All conditions¹;
Chronic conditions; Major nonchronic conditions; Minor nonchronic conditions;
Total disability²; Bed disability

¹ 2-week prevalence rate per person. ² Days per person.

                       Degrees of freedom      Tabular F-values
Area                   in F-ratio              5% level    1% level
All zones                                      1.81        2.32
Individual zone                                3.06        4.89
were interested in testing only for shifts in the direction of larger F-values,
since we regard a shift in the direction of lower F-values as a practical impossi-
bility.) The results of these comparisons for rates for all persons and rates by
sex, for each of the two respondent rules, are summarized in Table 77. These
comparisons seem to suggest the existence of sources of significant nonsampling
variance. 


TABLE 77

MAXIMUM DIFFERENCE (D_n) BETWEEN THE THEORETICAL F-DISTRIBUTION
AND THE OBSERVED DISTRIBUTION OF F-RATIOS

                                                        D_n for rates per person based on
Respondent Rule                                         All persons    Males    Females

(1) All adults to be interviewed for themselves             .26         .37       .21
(2) Adults found at home to be interviewed for
    themselves and absent related household members         .24         .11       .09

PROBABILITY POINTS OF D_n (n=16) (SEE REF. [16])

Probability (D_n ≥ ε)      .10     .05     .025    .01
ε                          .26     .29     .33     .37
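
The comparison summarized in Table 77 can be sketched in a few lines. This is only an illustration under stated assumptions: `one_sided_ks` takes any callable `cdf` for the theoretical distribution (for the use described here it would be the F-distribution with the appropriate degrees of freedom, which is not implemented in the sketch), and the function and variable names are hypothetical. The tabulated values of D_n are rounded to two decimals, so a value equal to a probability point (.26, .37) is a borderline case.

```python
def one_sided_ks(observed, cdf):
    """Largest amount by which the theoretical CDF exceeds the empirical CDF.

    A large value indicates observations shifted toward larger values
    than the theoretical distribution would produce.
    """
    xs = sorted(observed)
    n = len(xs)
    # Just below the (i+1)-th ordered value the empirical CDF equals i/n.
    return max(cdf(x) - i / n for i, x in enumerate(xs))


# Probability points of D_n for n = 16, as given with Table 77.
CRITICAL_POINTS = [(0.10, 0.26), (0.05, 0.29), (0.025, 0.33), (0.01, 0.37)]


def smallest_level_reached(d):
    """Smallest tabulated probability level whose point is met by d, else None."""
    reached = [p for p, eps in CRITICAL_POINTS if d >= eps]
    return min(reached) if reached else None


# D_n values as printed in Table 77 (rounded to two decimals).
TABLE_77 = {
    ("rule 1", "all persons"): 0.26,
    ("rule 1", "males"): 0.37,
    ("rule 1", "females"): 0.21,
    ("rule 2", "all persons"): 0.24,
    ("rule 2", "males"): 0.11,
    ("rule 2", "females"): 0.09,
}

levels = {key: smallest_level_reached(d) for key, d in TABLE_77.items()}
for key, level in levels.items():
    print(key, "reaches level", level)
```

On this reading only the Rule (1) entries for all persons and for males reach any of the tabulated probability points.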

















The effects of these appear to be more marked when all adults are interviewed
for themselves, and more marked for males interviewed for themselves
than for females interviewed for themselves. Since all the interviewers in the
survey were female, this latter suggests that in a household morbidity survey 
a difference in sex between the interviewer and respondent may itself be a source 
of nonsampling variance. This speculation would also be consistent with 
Horvitz’s data from the first Pittsburgh Arsenal District Study [12]. We 
speculate that the difference between the observations for rates by sex with the 
two respondent rules may have the following explanation: We can expect a 
basic variability between interviewers even with a questionnaire and inter- 
viewing procedure as structured as that in the Charlotte pretest. However, 
the extent to which this will be reflected in the data may depend primarily upon 
the respondent. Thus, a respondent reporting about the health of a related 
household member is limited by his or her own knowledge regardless of the 
knowledge and skill of the interviewer. We think, also, that the respondent 
when uncertain about the existence of an illness or chronic condition for the 
other person will tend not to report it—again, regardless of the interviewer’s 
skill. These factors would be clearly relevant to the differences for males. In 
the case of females, a high proportion of whom responded for themselves even 
with the second respondent rule, we think that the differences may reflect 
greater respondent fatigue and consequent deterioration of response with the 
second respondent rule. We want to emphasize again the speculative nature 
of the preceding discussion. However, we feel that it has given us some sug- 
gestions for future research. 
It was concluded that the evidence from the Charlotte pretest was not sufficiently
clear cut to suggest modifying the proposed design of the National
Health Survey. It did suggest the urgency of accumulating and analyzing
further evidence on a continuing basis and the need for accumulating large
amounts of data. Provision should also be made so that measures of sampling
variance of statistics from the survey would properly reflect the effects of
nonsampling variance.⁵

In the Charlotte pretest the interviewers—all but a few without previous 
survey experience—were given a week’s training and then carried out their in- 
terviewing over a period of 2-3 weeks. For the national survey interviewers are 
being given training and are being assisted by supervisor observation, on a 
continuing basis. In addition, as noted above, a continuing check of the inter- 
viewing is to be carried out by means of independent re-interviews with a sub- 
sample of persons in the survey sample. This check will provide information 
both about interviewers and the quality of work on the survey as a whole. 
Within the limitations of the survey procedure, these steps are intended to help 
reduce potential interviewer and other nonsampling variance and bias. 


5. THE USE OF CHECK LISTS OF DISEASES 


In the National Health Survey of 1935-36 a list of 17 chronic diseases was 
read to the respondent and, whenever a member of the household was reported 
to have one or more of these diseases, the conditions were recorded. In nearly 
every illness survey since that time in which the information was collected at 
a single visit a procedure of the same sort has been used. 


The universal experience with such lists has been that they result in the 
reporting of numerous chronic conditions which apparently would not have 
been picked up otherwise. 

In the California Health Survey,⁷ for example, a check list was used following
a series of other questions about illness, two of which were specifically devoted 
to ascertaining the existence of chronic conditions among members of the 
family. The results of the use of this list, consisting of 34 specific diseases or 
impairments, and finally “any other chronic condition,” were as follows: 

From questions preceding the check list came 56 per cent of the 31,302 chronic 
conditions picked up in the survey. Of these conditions reported from earlier 
questions 85 per cent were medically attended at some time in the past, and 
17 per cent had caused disability within the past four weeks. From the check 
list itself came 32 per cent of all chronic conditions; of these 72 per cent were 
medically attended and four per cent had caused recent disability. The remaining
12 per cent of the reported chronic conditions came from questions
following the check list and, in particular, from a check list of symptoms that 





5 This involves primarily arranging to randomize the sample between interviewers in self-representing primary
sampling units.

6 In some instances the check list device has been extended to include a list of important symptoms of illness
in order to identify persons who might be suffering from chronic diseases for which no medical attention had been
sought.

7 The data quoted here and elsewhere in this paper from the California Health Survey are used with the kind
permission of the California Department of Public Health. This statewide sample survey was carried out in 1954-55 
by the Department of Public Health with the assistance of the Bureau of the Census. It was partially financed by a 
grant from the National Institutes of Health, Public Health Service. 
was read. Of these conditions 64 per cent had been medically attended at some
time and six per cent had caused recent disability. 

Thus, the check list resulted in the reporting of considerable numbers of 
medically attended and a few currently disabling chronic conditions. 
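
The California percentages quoted above can be turned into approximate counts. The sketch below is only a worked piece of arithmetic: the shares are the rounded percentages given in the text, so the resulting counts are approximations rather than figures reported by the survey, and the dictionary keys and function name are hypothetical.

```python
TOTAL_CONDITIONS = 31302  # chronic conditions picked up in the survey

# source of report: (share of all conditions,
#                    share medically attended,
#                    share causing recent disability)
SOURCES = {
    "questions preceding the check list": (0.56, 0.85, 0.17),
    "check list itself": (0.32, 0.72, 0.04),
    "questions following the check list": (0.12, 0.64, 0.06),
}


def breakdown(total, sources):
    """Approximate counts implied by the rounded percentages."""
    rows = {}
    for name, (share, attended, disabling) in sources.items():
        count = total * share
        rows[name] = {
            "conditions": round(count),
            "medically_attended": round(count * attended),
            "recently_disabling": round(count * disabling),
        }
    return rows


rows = breakdown(TOTAL_CONDITIONS, SOURCES)
# Share of all reported chronic conditions that were medically attended.
overall_attended = sum(share * att for share, att, _ in SOURCES.values())

for name, row in rows.items():
    print(name, row)
print("overall share medically attended:", round(overall_attended, 3))
```

On these rounded figures the check list itself accounts for roughly ten thousand conditions, most of them medically attended but few recently disabling.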

Despite findings such as these concerning the productivity of the check list, 
there were serious doubts expressed about the use of this device. For one thing 
there was always a strong urge to increase the length of the list so as not to 
omit any disease of public health importance. In one survey, for example, a list 
of 51 chronic conditions and another list of 25 symptoms were used. Since the 
reporting of the diseases included seemed to be improved, it was hard to accept 
the exclusion of any particular one; yet it was recognized that too long a list 
might defeat its own purpose and lead to peculiar biases, among them an order 
effect [1]. 

To test the possibility of biases owing to the order in which the items were 
read, an experiment was carried out during the course of the Baltimore Health 
Survey.⁸ This survey included, in addition to a 33-item chronic disease check
list, a list of 12 selected symptoms which followed the longer list. During a 
10-week period the sample was split up so that the lists were read to some 
respondents in a reversed order. Ten random orders, a different one each week, 
were also introduced. Definite order effects were found for the symptoms, but 
the sample was not large enough to detect order effects in the chronic condition 
list. 
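
A split-sample design of this kind needs a forward order, a reversed order, and a set of random orders, one per week. The following is a minimal sketch of how such orders might be generated today; the item labels, the seed, and the function name are placeholders for illustration and are not taken from the Baltimore survey.

```python
import math
import random


def weekly_random_orders(items, n_weeks=10, seed=1953):
    """Return n_weeks distinct random orderings of items, one per week."""
    if math.factorial(len(items)) < n_weeks:
        raise ValueError("not enough distinct orderings for the weeks requested")
    rng = random.Random(seed)  # fixed seed so the schedule is reproducible
    orders = []
    while len(orders) < n_weeks:
        candidate = rng.sample(items, len(items))
        if candidate not in orders:  # "a different one each week"
            orders.append(candidate)
    return orders


# Placeholder labels for a 12-item symptom list.
symptoms = [f"symptom {i}" for i in range(1, 13)]
forward_order = list(symptoms)
reversed_order = forward_order[::-1]
random_orders = weekly_random_orders(symptoms)

for week, order in enumerate(random_orders, start=1):
    print(week, order[:3], "...")
```

Comparing reporting rates for an item by its position across these orders is one way to look for the order effects described above.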

The question of whether to use such a check list was carefully considered in 
planning the present National Health Survey. It was believed that there might 
be superior methods of eliciting reports of chronic disease. Probe questions 
might be constructed that had the effect of defining chronic disease in terms 
of the kinds of actions people take as a result of their illness (seeing a doctor, 
taking medicine or treatment, changing work or other activities, etc.). Such a 
definition would be more consistent with the concepts in other parts of the 
survey, but it undoubtedly would bring about a substantial reduction in the 
amount of chronic disease reported, since many cases of chronic disease do not 
appear to be accompanied by significant action on the part of the individual 
affected. There would be compensating advantages if the new definition gave 
increased objectivity to the concepts and less bias. In any case it was finally 
decided that there was not sufficient time to develop and test an improved
method before the start of the national survey. This problem was, therefore, 
postponed for future consideration and, for the time being, the check list 
approach was retained. 

In the Charlotte pretest a single list was used containing 31 items. This 
check list came at the end of a series of questions on illness and, as in the Cali- 
fornia survey, just after two general questions, dealing with chronic conditions 
and impairments. 

Again, despite its position in the interview, this question led to additional 
reports of conditions as indicated in Table 80. Note that only slightly smaller 





8 The Baltimore Health Survey was conducted by the Commission on Chronic Illness with the assistance of the
Bureau of the Census in 1953 and 1954. For a report on this survey see: “Chronic Illness in a Large City,” Volume
IV of “Chronic Illness in the United States,” published for the Commonwealth Fund by the Harvard University
Press, 1957.
percentages of the conditions reported from the check list involved medical
care, or days in bed in the past year, than for conditions reported earlier in the 
interview. 

The only change made for the national survey was to split the list into two 
parts, primarily for convenience in interviewing. One part is a list of 26 chronic 
diseases or groups of diseases. The other consists of nine impairments, such as 
impaired hearing, impaired eyesight, paralysis, loss of limb, and so forth. 


TABLE 80

CHARACTERISTICS OF CHRONIC CONDITIONS REPORTED IN ANSWER
TO THE CHECK LIST QUESTION¹ AND TO OTHER PROBE QUESTIONS,
CHARLOTTE, N. C., PRETEST, NATIONAL HEALTH SURVEY,
FEBRUARY 1957

(Data are unweighted responses for 3,686 persons. The reading of the check list
sometimes brought out reports of conditions other than those read. Likewise, conditions named
on the check list were often reported from other questions. The check list question was
the last of the illness probe questions. The source was taken as the first question which
resulted in the reporting of this condition.)

                                                           Seen by a doctor       Not seen by a doctor
                                                          Bed         No bed       Bed         No bed
Type and source of report of                              disability  disability   disability  disability
chronic condition                                Total    in past     in past      in past     in past
                                                          year        year         year        year

Total                                 Number      ...       654       1,776         ...         421
                                      Per cent    100        23          62         ...          15

Conditions named on the check list:
  Reported from check list question   Number     1,238      195         749         ...         281
                                      Per cent    100        16          61         ...          23

  Reported from other questions       Number      998       274         642         ...          72
                                      Per cent    100        27          64         ...           7

Conditions not named on the check list:
  Reported from check list question   Number       73        15          42         ...          15
                                      Per cent    100        21          58         ...          21

  Reported from other questions       Number      574       170         343           8          53
                                      Per cent    100        30          60           1           9

(... = entry not legible in the source.)




















1 “Has anyone in the family had trouble with any of these conditions during the past 12 months?” This was fol- 
lowed by the reading of a list of 31 chronic diseases and physical impairments. 
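
Several cells of Table 80 are not legible in this copy, but each row total is the sum of its four cells, so a single missing cell in a row can be inferred by subtraction. The sketch below does this under the assumption that the missing entry in each row is the "not seen by a doctor, bed disability" cell (the column assignment is an inference from the one complete row). The inferred values are a consistency check, not figures read from the source, and the names are hypothetical.

```python
# (total, seen+bed, seen+no bed, not seen+bed, not seen+no bed); None = illegible
ROWS = {
    "named, from check list question": (1238, 195, 749, None, 281),
    "named, from other questions": (998, 274, 642, None, 72),
    "not named, from check list question": (73, 15, 42, None, 15),
    "not named, from other questions": (574, 170, 343, 8, 53),
}


def fill_missing(row):
    """Infer a single missing cell as the row total minus the legible cells."""
    total, *cells = row
    missing = [i for i, c in enumerate(cells) if c is None]
    if len(missing) == 1:
        cells[missing[0]] = total - sum(c for c in cells if c is not None)
    return (total, *cells)


filled = {name: fill_missing(row) for name, row in ROWS.items()}
grand_total = sum(row[0] for row in filled.values())
column_sums = tuple(sum(row[i] for row in filled.values()) for i in range(1, 5))

print("inferred rows:", filled)
print("grand total:", grand_total)
print("column sums:", column_sums)
```

The column sums for the three legible columns of the "Total" row (654, 1,776, and 421) are reproduced, which supports the reconstruction.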


6. RECALL OF ILLNESS EVENTS IN A SPECIFIED PERIOD OF TIME 


The problem of recall of events in a specified period of time is not one that is 
confined to illness surveys. It has been studied in connection with marketing 
research, for example. Collins et al. [5] used data from a study of illness in
Cattaraugus County and Syracuse, New York, to show that incidence rates of
illness declined with an increasing length of interval between the reported
month of onset of the illness and the interview, presumably because the re- 
spondent recalled fewer illnesses for those months that were more distant in 
time. Since questions in the monthly Sickness Survey of England and Wales 
covered each of the two months preceding the interview separately, Stocks 
[18] had an opportunity to compare statistics for the month just preceding the 
interview and the second month before the interview. Stocks’ data seemed to 
indicate that the drop-off for the longer period of recall was no less for the 
more serious illnesses than it was for the less serious. A study by Woolsey [20] 
of recall of sick leave taken by Federal employees, matched against official 
records, showed that the order in which three different time periods were 
covered in the interview made a difference. There was actually a tendency to 
overestimate sick leave taken in the period first asked about, regardless of 
whether that was a more recent or a more distant period, but the underestima- 
tion for the period asked about last in the interview was greater when that pe- 
riod was more distant in time than when it was a more recent period. There is 
also evidence to support Stocks’ belief that events recalled by the respondent
are sometimes assigned to a more recent time period than the correct one. A 
telescoping effect takes place [9]. 

In the California Health Survey of 1954-55 respondents were asked about 
illnesses in the four calendar weeks prior to the week of interview, and the week 
of onset of illnesses starting during this period was also obtained. Since the 
survey was carried out over a period of 52 weeks with the total sample being 
randomized over all weeks, it was possible to tabulate the incidence of illness 
according to the number of weeks elapsing between the week of onset and the 
week of interview. Table 82a shows the incidence rates for each week in various 
categories of illness expressed as a ratio to the rate for the week immediately 
before the week of interview. 
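
The tabulation just described can be sketched as follows. Because the sample was randomized over all weeks, each elapsed-week category covers the same population at risk, so the ratio of counts equals the ratio of rates. The records below are hypothetical and serve only to show the calculation; the function name is also an assumption.

```python
from collections import Counter


def ratios_to_week_1(weeks_elapsed, max_weeks=4):
    """Onsets by weeks elapsed before the interview, as a ratio to week 1."""
    counts = Counter(w for w in weeks_elapsed if 1 <= w <= max_weeks)
    base = counts[1]
    if base == 0:
        raise ValueError("no onsets reported for the week before the interview")
    return {w: counts[w] / base for w in range(1, max_weeks + 1)}


# Hypothetical records: weeks between week of onset and week of interview.
reported = [1] * 40 + [2] * 34 + [3] * 30 + [4] * 26
ratios = ratios_to_week_1(reported)

for week, ratio in ratios.items():
    print(f"week {week}: ratio to week 1 = {ratio:.2f}")
```

A ratio that falls steadily below 1.00 as the number of weeks increases is the pattern that suggests poorer recall for the more distant weeks.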

It was apparent from these results that the recall was better for acute ill- 
nesses (including current injuries) than for episodes of illness caused by a 
chronic condition, and that it was better for disabling and for medically at- 
tended acute illnesses than for acute illnesses involving neither disability nor 
medical care. For episodes of chronic illness, however, neither the existence of 
disability nor the fact of medical attendance (a characteristic that applied to 
about two-thirds of all chronic episodes) seemed to help very much.⁹

As a result of these findings it was decided to seek information in the Char- 
lotte pretest concerning illness for only the two calendar weeks prior to the 
interview. Furthermore, the California experience led to a decision, to be dis- 
cussed further below, not to attempt to count separate episodes of illness in the 
course of a chronic disease. Hence, Table 82b based upon the Charlotte pretest 
includes acute illnesses and current injuries only. 

The evidence from the pretest showed that there was no consistent drop-off 
for the second week before the interview as compared with the first, except 





9 While the episodes of illness counted were those that began in the four weeks before the interview, the fact of
medical attendance relates here to the chronic condition causing the illness. Hence, the most recent medical care 
was in some instances a year or more before the onset of the flare-up of illness. It should also be noted that the 
definition of disability in the California survey was somewhat broader than that used in the Charlotte pretest. 
TABLE 82a

INCIDENCE RATES OF ILLNESS FOR WEEKS PRIOR TO THE INTERVIEW
EXPRESSED AS A RATIO TO THE RATE FOR THE WEEK
IMMEDIATELY PRECEDING THE INTERVIEW.
CALIFORNIA HEALTH SURVEY, 1954-55

                                      Number of weeks prior to Week of Interview
Illness Category                          1         2         3         4

                                                  Ratio to Week 1

Acute Conditions:
  All illnesses                          1.00      ...       ...       ...
  Doctor seen                            1.00      ...       ...       ...
  Disabling                              1.00      ...       ...       ...
  Doctor seen and disabling              1.00      ...       ...       ...
  Doctor seen or disabling               1.00      ...       ...       ...
  Doctor not seen, not disabling         1.00      ...       ...       ...

Chronic Conditions:
  All illnesses                          1.00      ...       ...       ...
  Doctor seen                            1.00      ...       ...       ...
  Disabling                              1.00      ...       ...       ...
  Doctor seen and disabling              1.00      ...       ...       ...
  Doctor seen or disabling               1.00      ...       ...       ...
  Doctor not seen, not disabling         1.00      ...       ...       ...

(... = entry not legible in the source.)

















possibly for illnesses involving neither medical care nor disability. Two weeks 
were, therefore, adopted as the recall period in the national survey for all ill- 
ness, injuries, and medical and dental care visits, except that for hospitalized 


TABLE 82b

INCIDENCE RATES OF ILLNESS FOR THE SECOND WEEK PRIOR TO THE
INTERVIEW EXPRESSED AS A RATIO TO THE RATE FOR THE
WEEK IMMEDIATELY PRECEDING THE INTERVIEW. CHARLOTTE,
N. C., PRETEST, NATIONAL HEALTH SURVEY,
FEBRUARY 1957

(Data are unweighted responses for 3,686 persons)

                                    Incidence rate per       Number of weeks prior to
                                    1,000 pop. in week       Week of Interview
Illness category                    prior to interview           1           2
                                    (Week 1)
                                                               Ratio to Week 1

All illnesses                              ...                  ...          .95
Doctor seen                                ...                  ...         1.11
Disabling                                  ...                  ...          .97
Doctor seen and disabling                  ...                  ...         1.10
Doctor seen or disabling                   ...                  ...         1.00
Doctor not seen, not disabling             ...                  ...          .87

(... = entry not legible in the source.)
illness a recall period of a year was maintained. The two-week period was used
for medical and dental care in order to avoid reference to another time period 
in the interview. Results of the morbidity research study conducted by the 
California Department of Public Health in San Jose demonstrated that hos- 
pitalized illness can be reported reasonably accurately for a period of a year [2]. 


7. DEFINING ATTACKS OF CHRONIC ILLNESS 


A matter of interest in describing the effects of chronic diseases is the acute 
phases, or attacks, or flare-ups, of illness that they cause. In several previous 
surveys attempts have been made to collect information about these attacks 
by the same procedures used for acute illnesses and injuries. In the five-year 
study in the Eastern Health District of Baltimore [5], in which visits were 
made to the families at monthly intervals, the periods of disability in the course 
of the disease were identified. But when the survey is limited to a single visit 
the attempt to make a distinction in the respondent’s mind between the onset 
of an attack or the onset of a period of disability, on the one hand, and the 
original onset of the chronic disease, on the other, has proven troublesome. One 
idea or the other can perhaps be explained, but not both. 

The existence of this difficulty became evident not only in attempting to 
train interviewers but also in listening to respondents in the interviews. More- 
over, even if the desired distinction could be made clear, it seemed that some 
disease conditions did not fit the preconceived pattern. Instead of exhibiting 
clearly definable attacks, they merely made the person feel a bit worse some 
days and a bit better other days. The respondent was hard put to it to say 
when the attack began. 

For these reasons it was decided to make no attempt to count the attacks 
or the periods of disability for chronic conditions in the basic questionnaire of 
the National Health Survey. Furthermore, the age at onset for all reported 
chronic conditions is not sought. Instead the effort is directed toward learning 
whether this chronic condition is one that first began to give trouble in the last 
year or not, the approximate date of most recent medical care, whether the 
condition is still under medical supervision, and the number of days in the past 
year during which the person has been kept in bed for all or most of the day 
on account of the condition. 

This pattern of questioning was tried out in the Charlotte pretest and seemed 
to work satisfactorily, except that the proportion of the chronic conditions re- 
ported to have been first noticed in the past 12 months was larger than one 
might have expected a priori. About five per cent were first noticed during the 
three months before the interview, another 12 per cent were first noticed more 
than three months but less than a year before, and 82 per cent were a year or 
more old, including about two per cent that were stated to have been present 
since birth. The incidence rate for new chronic conditions in the previous year 
among the living population at the time of the survey was about 130 per 
1,000 population per year. This high rate may indicate either that the date of
onset is not reported accurately or that some of the conditions reported as
chronic are actually transitory. A Public Health Service study recently con- 
ducted in Hagerstown, Maryland, in which families were reinterviewed after 
an interval of a year, may throw some light on this question. 
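
The rate of about 130 per 1,000 can be checked roughly against other figures given for the pretest. The sketch below assumes, as an illustration only, that the percentages first noticed in the past year apply to the chronic conditions counted in Table 80 (the sum of its four row totals) and that the base is the 3,686 persons in the pretest. Neither assumption is stated in the text, so the result is a consistency check rather than a reproduction of the authors' calculation.

```python
PERSONS = 3686                      # persons in the Charlotte pretest
CONDITIONS = 1238 + 998 + 73 + 574  # row totals of Table 80 (assumed base)

# Shares first noticed within the past year, as quoted in the text.
share_last_3_months = 0.05
share_3_to_12_months = 0.12
share_new = share_last_3_months + share_3_to_12_months

new_conditions = CONDITIONS * share_new
rate_per_1000 = 1000 * new_conditions / PERSONS

print("new chronic conditions in past year (approx.):", round(new_conditions))
print("rate per 1,000 population per year (approx.):", round(rate_per_1000))
```

Under these assumptions the rate comes out close to the figure quoted above.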
8. THE DESCRIPTION OF THE NATURE OF THE DISEASE, INJURY, OR IMPAIRMENT


An early decision in planning the National Health Survey was to use the 
International Statistical Classification of Diseases, Injuries and Causes of 
Death for coding the diagnostic information. This was modified for the purposes 
of the survey by the substitution of a special code for physical impairments, 
devised by the Public Health Service, so that impairments could be grouped 
in one place and classified primarily by the nature of the impairment and 
secondarily by the etiology. 

The decision regarding the use of the ISC code immediately raised a question 
of improving the codability of the information secured in the interview. Train- 
ing interviewers to know what is an acceptable entry for the nature of the con- 
dition is the most difficult part of the training procedure in morbidity surveys. 
The space devoted to the matter in the interviewer’s manual in numerous 
surveys is evidence of the problem. Consequently, an effort was made to in- 
corporate some of the rules for required additional detail in the regular questions 
of the interview. The ISC code was reviewed and it was observed that the extra 
detail which would help to keep the condition from being thrown into “not- 
otherwise-specified” categories was principally of three kinds: 

1. The type of a particular disease, as for example, the type of heart disease; 

2. The cause of a particular condition, for example, the cause of a symptom
such as “fever” or “diarrhea”;

3. The anatomic site affected, as for fractures or ulcers. 

It was recognized, of course, that the respondent could not always supply 
the extra detail, but it was considered desirable to make an effort to get the 
information needed for coding purposes in as many cases as possible. 

Accordingly, three extra questions were worked out for the part of the 
Charlotte interview where the nature of the condition was to be recorded. The 
interviewer first set down the initial description given by the respondent in 
answer to the question, “What was the matter?” and, in the case of a medically 
attended condition, the additional questions, “What did the doctor say it was? 
Did he use any medical terms?” The interviewer was then instructed to ask: 
“What kind of . . . trouble is it?” “What was the cause of .. . ?” and “What 
part of the body was affected?” whenever these questions were appropriate. 

It has turned out, however, that training the interviewers to know when 
these questions are appropriate is still the most difficult part of the training 
procedure. “What kind of heart trouble is it?” is a reasonable question; “What 
kind of rheumatic fever is it?” on the other hand, makes very little sense to a 
respondent.

The hand tally results shown in Table 85 illustrate the problems that be- 
came evident in the Charlotte pretest. In roughly 60 per cent of the cases all 
of the information needed for coding was obtained. In about 15 per cent the 
interviewer asked the right questions but the respondent did not know the 
answers or gave irrelevant or highly unlikely responses. In the remaining 25 
per cent one or more needed items were not sought by the interviewer. Of this 
last group the majority were cases that did not seem to have been covered by 
the interviewing instructions. 
TABLE 85

TALLY OF A SAMPLE OF QUESTIONNAIRES TO SHOW THE ADEQUACY FOR
CODING OF DIAGNOSTIC INFORMATION, CHARLOTTE, N. C.,
PRETEST, NATIONAL HEALTH SURVEY, FEBRUARY 1957

(Based upon a sample of 75 questionnaires. Data are unweighted tallies. Analysis
ignores instances when unneeded information was obtained.)

                                                                           Number of
                                                                           medical
                                                                           conditions

Total                                                                         219
All information needed for coding obtained¹                                   126
Needed information sought by interviewer but apparently not known to
  respondent²                                                                  34
Some item or items of needed information not sought by interviewer: Total      59
    Judged to have been covered by interviewing instructions                   16
    Judged not to have been covered by instructions                            40
    Indeterminate                                                               3
  Information missing had to do with:
    Type of condition                                                         ...
    Cause of symptom or impairment                                            ...
    Anatomic site                                                             ...
    Nature of present trouble for old injury                                  ...
    Other information missing                                                 ...

(... = entry not legible in the source.)

1 Without regard to whether information was obtained in the prescribed manner.
2 Includes instances where the right questions were asked but the respondent gave irrelevant or medically
unlikely responses.
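
The rounded percentages quoted in the text ("roughly 60," "about 15," and "the remaining 25" per cent) can be compared with the tallies in Table 85. A short sketch of the arithmetic, with hypothetical dictionary keys:

```python
TALLY = {
    "all needed information obtained": 126,
    "sought but not known to respondent": 34,
    "not sought by interviewer": 59,
}
NOT_SOUGHT = {
    "covered by instructions": 16,
    "not covered by instructions": 40,
    "indeterminate": 3,
}

total = sum(TALLY.values())
shares = {name: 100 * count / total for name, count in TALLY.items()}
not_sought_shares = {
    name: 100 * count / TALLY["not sought by interviewer"]
    for name, count in NOT_SOUGHT.items()
}

print("total medical conditions:", total)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f} per cent")
for name, pct in not_sought_shares.items():
    print(f"  {name}: {pct:.1f} per cent of those not sought")
```

The exact shares are somewhat below 60 per cent and somewhat above 25 per cent, which is consistent with the rounded wording of the text, and about two-thirds of the "not sought" cases fall outside the interviewing instructions.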


The Charlotte results were helpful in revising the interviewer’s manual prior 
to the start of the national survey and in preparing training materials. Further- 
more, numerous instances were observed in which the extra questions elicited 
enough additional information to permit assignment of a more specific code. 
With minor changes the procedure used in the Charlotte interview was carried 
over to the final questionnaire. Nevertheless, the problem of the adequacy of 
diagnostic entries continues to be a troublesome one. Further research is being 
carried out in the national survey on the following questions: (1) Is the degree 
of detail in which this coding is done consistent with the ability of the respond- 
ents to supply specific descriptions of the nature of the family’s ailments, as 
given to the family by the physician? (2) To what extent will two coders inde- 
pendently arrive at the same ISC code for a given entry on the questionnaire? 


9. DISCUSSION 


The foregoing does not by any means cover all of the efforts that were made 
to improve the morbidity survey techniques for the National Health Survey. 
At many other points in the interview concepts were made more objective to 
reduce errors of response. While there is much work still to be done, it is believed 
that there has been progress made toward more objective working definitions 
for: a day of disability, a bed day, medical attendance for an illness, a hospital- 
ized illness, a medical service, a case of blindness, and several other items. 
In all of this work the principle followed has been to seek more objective
operational definitions in order to reduce response variability, even if this has 
to be done at the cost of some arbitrariness. 

Furthermore, more than the usual proportion of resources has been put
into organization of the interview and the field work in such a way as to reduce 
response error and bias. Examples that have been mentioned are: the use of 
more restrictive rules for determining which persons are to be considered ac- 
ceptable respondents, and the reduction of the recall period from four weeks 
to two weeks. The question of whether the same money spent on increasing the 
sample size would give greater precision is a complicated one to which an 
answer will not be attempted here. It depends, of course, upon whether one is 
searching for an optimum design for national estimates or for regional estimates, 
for estimates for all ages combined or for detailed age groups, and so forth. 
Until there are more data available and there is more experience with use of the 
survey results, decisions on questions of that kind must be based largely on 
judgment. 


10. SUMMARY 


This paper deals with certain problems inherent in the measurement of 
morbidity by means of interview surveys. These are problems regarding which 
decisions had to be made in planning the household survey for the National 
Health Survey Program. The decisions were based as far as possible upon 
evidence from earlier surveys and data from a pretest conducted in the standard 
metropolitan area of Charlotte, North Carolina. 

Data from earlier surveys and from the Charlotte pretest are presented to 
show how the decisions came to be made on: (1) rules for an acceptable re- 
spondent in the household; (2) the treatment of error and bias associated with 
the interviewer; (3) the use of certain types of “probes” in the interview; (4) 
the problem of recall of illness events in a specified period of time; and other 
questions. 

In all of the planning the principle followed was to seek more objective operational
definitions, even sometimes at the cost of some arbitrariness, and also
to go to extra expense, when necessary, in the organization of the interview 
and field work in order to reduce response error and bias. 


When estimating the mean value of a quantity $x$, in a population to
be divided into $L$ strata according to the value of a quantity closely
correlated with $x$, it is necessary to choose the $L-1$ points of stratification.
Nearly optimum points are obtained if they are chosen to equalize
the integrals over the various strata of the square root of the population
density. A simple method for the iterative improvement of the points is
given and illustrated on several examples.
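
The rule stated above, equalizing the integral of the square root of the density over the strata, can be sketched numerically. This is a minimal illustration and not the authors' computational procedure: the density `f`, the range, the grid size, and the function name are all assumptions made for the example, and the integral is approximated by the midpoint rule.

```python
import math


def cum_sqrt_f_points(f, lo, hi, n_strata, steps=10000):
    """Cut points that equalize the integral of sqrt(f) over each stratum."""
    width = (hi - lo) / steps
    cum = [0.0]  # cum[j] approximates the integral of sqrt(f) from lo to lo + j*width
    for i in range(steps):
        t = lo + (i + 0.5) * width
        cum.append(cum[-1] + math.sqrt(f(t)) * width)
    total = cum[-1]

    points = []
    j = 0
    for h in range(1, n_strata):
        target = total * h / n_strata
        while cum[j + 1] < target:
            j += 1
        # Linear interpolation inside grid cell j.
        frac = (target - cum[j]) / (cum[j + 1] - cum[j])
        points.append(lo + (j + frac) * width)
    return points


# Uniform density on (0, 1): the rule gives equally spaced points.
uniform_points = cum_sqrt_f_points(lambda t: 1.0, 0.0, 1.0, 3)

# Triangular density f(t) = 2t on (0, 1), two strata.
triangular_points = cum_sqrt_f_points(lambda t: 2.0 * t, 0.0, 1.0, 2)

print(uniform_points)
print(triangular_points)
```

For the triangular density the integral of the square root is proportional to t^(3/2), so the single cut point should fall near 0.5^(2/3), or about 0.63, rather than at the midpoint of the range.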


1. INTRODUCTION
1.1 Four specific design operations. 


The use of stratified sampling involves four specific design operations: 

(a) the choice of a stratification variable; 

(b) the choice of the number L of strata; 

(c) the determination of the way in which the population is to be stratified;
and

(d) the choice of the size $n_h$ of the sample to be taken from the $h$th stratum.

In the present paper, we will discuss problem (c). The emphasis is on theory 
only; in par. 1.2.1 we indicate some aspects of great practical importance but 
not dealt with here. 


1.2 Review of different rules for stratification. 

1.2.1 An exact solution. Dalenius [2] used an approach which provided an 
exact solution to the problem as treated. This approach will be reviewed here, 
since it is used in paragraph 2 of the present paper.

Dalenius considers a density $f(x)$ with mean

$$\mu = \int_{x_0}^{x_L} t f(t)\,dt \tag{1}$$

The range $x_0$, $x_L$ of the estimation variable $x$ is cut up into $L$ parts at points







MINIMUM VARIANCE STRATIFICATION 89 


$x_1 < \cdots < x_h < \cdots < x_{L-1}$. Each such part corresponds to a stratum; i.e.
the estimation variable x is used as stratification variable as well.¹ For the hth
stratum

$$W_h = \int_{x_{h-1}}^{x_h} f(t)\,dt \tag{2}$$

$$W_h \mu_h = \int_{x_{h-1}}^{x_h} t f(t)\,dt \tag{3}$$

$$\sigma_h^2 = \frac{1}{W_h}\int_{x_{h-1}}^{x_h} t^2 f(t)\,dt - \mu_h^2 \tag{4}$$


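Obviously

$$\mu = \sum_h W_h \mu_h \tag{5}$$

A sample of $n = \sum_h n_h$ observations is selected from f(x) and $\mu$ is estimated by

$$\bar{x} = \sum_h W_h \bar{x}_h \tag{6}$$

This estimate has variance

$$\sigma^2(\bar{x}) = \sum_h W_h^2 \frac{\sigma_h^2}{n_h} \tag{7}$$

It is well known that the variance is minimum when using the Tschuprow-Neyman
allocation, i.e. when $n_h \propto W_h \sigma_h$. Then the variance equals

$$\sigma^2_{\min}(\bar{x}) = \frac{1}{n}\Bigl(\sum_h W_h \sigma_h\Bigr)^2 \tag{8}$$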


This variance is a function of the points $x_h$ of stratification. Dalenius [2]
demonstrated that the set $[x_h]$ of cutting points satisfying the relation

$$\frac{\sigma_h^2 + (x_h - \mu_h)^2}{\sigma_h} = \frac{\sigma_{h+1}^2 + (x_h - \mu_{h+1})^2}{\sigma_{h+1}}, \qquad h = 1, \ldots, L-1 \tag{9}$$

corresponds to minimum variance stratification, MVS.

We wish to stress the fact that the approach discussed here is effective in 
the sense of mathematical theory. In practical sample design work, one does not 
have a nice density f(x) with which to work, but is restricted rather to some 
measure of size from the past. Moreover, in practice, such considerations as 
costs and the breadth of the optimum arise. 

An interesting application of Eq. (9) is discussed by Strecker [9, Sec. 
223.122]. 





¹ Dalenius and Gurney [4] discuss the case with a specific stratification variable.
The solution given by Eq. (9) presents computational difficulties when applied
in a general case. For the special case where f(x) is the normal density
$\varphi(x)$, Zindler [10] gives a useful computational procedure. In par. 2 of this
paper, a simple, generally applicable method of approximating the exact set
$[x_h]$ is presented.

1.2.2 A conjecture. Dalenius and Gurney [4] conjectured that

$$W_h \sigma_h = \text{constant} \tag{10}$$

would serve as an approximation to Eq. (9), when L is "large."²

1.2.3 A rule of thumb. The observation that for many populations, and for
reasonable locations of the stratum boundaries, the relative variance does not
vary much from stratum to stratum, leads from Eq. (10) to

$$W_h \mu_h' = \text{constant} \tag{11a}$$

where $\mu_h'$ refers to a measure variable which is assumed to be highly correlated
with the estimation variable, as discussed by Hansen et al. [6, p. 219]. For the
special case when these variables are identical, the rule is

$$W_h \mu_h = \text{constant} \tag{11b}$$


Mahalanobis [8, p. 4, footnote] proposes a similar rule: "... an optimum
or nearly optimum solution would be obtained when the expected contribution
of each stratum to the total aggregate value of x is made equal for all strata."

Kitagawa [7, pp. 27-30] analyzes this “principle of equipartition” in order to 
justify it and presents some related results. 

1.2.4 Equidistant stratification. A natural first step when setting up strata
is often to make

$$x_h - x_{h-1} = \text{constant} \tag{12}$$

Aoyama [1] derived this result by applying the mean value theorem to
Eq. (9), and assuming that the variation of the density is small in each stratum.


Dalenius and Hodges [5] analyze this rule in some detail. 


2. AN APPROXIMATION TO THE EXACT SOLUTION 
2.1 The approach of this paper 


This paper will make use of the same approach as the one used by Dalenius
[2]; the crucial point is that the population is represented by a density f(x).
As a consequence, the results arrived at are of the same theoretical nature as
those discussed in par. 1.


2.2 First approximation 
We introduce the transformation

$$y(u) = \int_{-\infty}^{u} \sqrt{f(t)}\,dt \tag{13}$$





2 In practical sample design, it is important to realize that the big gain from the use of one-stage stratified 
sampling is obtained in going from L =1 to L =2. This point is illustrated by Dalenius [3, Ch. 8]. 
When $u \to \infty$, y(u) approaches an upper bound H. The roots $x_1', \ldots, x_h', \ldots, x_{L-1}'$
of the following equations

$$y(u) = \frac{h}{L} H, \qquad h = 1, \ldots, L-1 \tag{14}$$

are taken as the (first) approximations, for large L, to the points $x_1, \ldots, x_{L-1}$
satisfying Eq. (9).
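
The rule in Eq. (14) can be tried numerically for any density that can be evaluated on a grid. The following is a minimal sketch, not the authors' procedure: the function name `cum_sqrt_f_points`, the trapezoidal integration, and the truncation of an infinite range to a finite interval `[lo, hi]` are all assumptions made for illustration.

```python
import math

def cum_sqrt_f_points(f, lo, hi, L, steps=100_000):
    """Approximate the L-1 roots of y(u) = (h/L) H, Eq. (14), on [lo, hi].

    y(u) is the integral of sqrt(f) from lo to u, computed by the
    trapezoidal rule; [lo, hi] is an assumed (possibly truncated) range.
    """
    step = (hi - lo) / steps
    xs = [lo + i * step for i in range(steps + 1)]
    roots = [math.sqrt(f(x)) for x in xs]
    y = [0.0]
    for i in range(steps):
        y.append(y[-1] + 0.5 * step * (roots[i] + roots[i + 1]))
    H = y[-1]  # upper bound of y(u) on the chosen range
    points = []
    k = 1
    for i in range(1, steps + 1):
        while k < L and y[i] >= k * H / L:
            # linear interpolation inside the grid cell that contains the root
            frac = (k * H / L - y[i - 1]) / (y[i] - y[i - 1])
            points.append(xs[i - 1] + frac * step)
            k += 1
    return points

# Example: f(x) = exp(-x), truncated at 40 for the numerical integration.
print(cum_sqrt_f_points(lambda x: math.exp(-x), 0.0, 40.0, 3))
```

For the exponential density the roots are also available in closed form, which makes it a convenient check on the numerical version.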


2.3 Justification 

This approximation may be derived by the following heuristic argument.
When L is large, the strata will be narrow, and each will have an approximately
rectangular distribution, so that $\sqrt{12}\,\sigma_h \approx x_h - x_{h-1}$. Then, by the mean value
theorem there exists a value $f_h$ of f in the hth stratum, such that

$$\sqrt{12}\sum_h W_h \sigma_h \approx \sum_h \bigl[\sqrt{f_h}\,(x_h - x_{h-1})\bigr]^2 \approx \sum_h [y_h - y_{h-1}]^2 \tag{15}$$

The last sum is minimized by making $y_h - y_{h-1}$ = constant. A rigorous proof has
been given by Dalenius and Hodges [5].


2.4 Adaptation to numerical calculations

We assume that the density f(x) is stratified into L strata. Two consecutive
strata are specified by $x_{h-1}$, $x_h$ and $x_{h+1}$. In order to simplify the formulas, we
will denote these points of stratification by $x_{h-1} = x_g$, $x_h = x_h$ and $x_{h+1} = x_i$.

The interval $(x_g, x_h)$ corresponds to the hth stratum, and $(x_h, x_i)$ to the ith
stratum.

We define

$$I_p(u) = \int_{-\infty}^{u} t^p f(t)\,dt \tag{16}$$

The conditional means $\mu_h$, $\mu_i$ and variances $\sigma_h^2$, $\sigma_i^2$ of the two strata are easily
expressed in terms of $I_p(u)$:

$$\mu_h = \frac{I_1(x_h) - I_1(x_g)}{I_0(x_h) - I_0(x_g)}, \qquad \mu_i = \frac{I_1(x_i) - I_1(x_h)}{I_0(x_i) - I_0(x_h)}$$

$$\sigma_h^2 = \frac{I_2(x_h) - I_2(x_g)}{I_0(x_h) - I_0(x_g)} - \mu_h^2, \qquad \sigma_i^2 = \frac{I_2(x_i) - I_2(x_h)}{I_0(x_i) - I_0(x_h)} - \mu_i^2$$

Moreover, we define

$$J_{ph} = I_p(x_h) - I_p(x_g) \tag{20}$$

$$J_{pi} = I_p(x_i) - I_p(x_h) \tag{21}$$


The condition given by Eq. (9) may now be expressed in terms of $J_{ph}$ and
$J_{pi}$ as follows

$$\frac{J_{2h} - 2x_h J_{1h} + x_h^2 J_{0h}}{\sqrt{J_{0h}J_{2h} - J_{1h}^2}} - \frac{J_{2i} - 2x_h J_{1i} + x_h^2 J_{0i}}{\sqrt{J_{0i}J_{2i} - J_{1i}^2}} = 0 \tag{22}$$

For simplicity, this expression will be written

$$A_h - B_h = \Delta_h = 0 \tag{23}$$

The set $[x_h]$ satisfying Eq. (23) corresponds to MVS. If we substitute any
other values, say $x_h'$, we will denote the first and second terms on the left side
of Eq. (22) by $A_h'$ and $B_h'$ respectively and the difference by $\Delta_h'$.
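
As an illustration of Eqs. (20) to (23), the sketch below evaluates the two terms of Eq. (22) for the density f(x) = exp(-x) on (0, infinity), using the closed forms of I_p(u) for that density. The helper names `I`, `side` and `deltas` are hypothetical and chosen only for this example.

```python
import math

def I(p, u):
    """I_p(u): integral of t^p exp(-t) from 0 to u, p = 0, 1, 2 (u may be math.inf)."""
    if math.isinf(u):
        return (1.0, 1.0, 2.0)[p]
    e = math.exp(-u)
    return (1.0 - e, 1.0 - e - u * e, 2.0 - e * (u * u + 2.0 * u + 2.0))[p]

def side(x_cut, lo, hi):
    """(J2 - 2 x J1 + x^2 J0) / sqrt(J0 J2 - J1^2) for the stratum (lo, hi)."""
    J0, J1, J2 = (I(p, hi) - I(p, lo) for p in range(3))
    return (J2 - 2.0 * x_cut * J1 + x_cut ** 2 * J0) / math.sqrt(J0 * J2 - J1 ** 2)

def deltas(points):
    """Delta_h = A_h - B_h at every interior cut point, Eq. (23)."""
    edges = [0.0] + list(points) + [math.inf]
    return [
        side(edges[h], edges[h - 1], edges[h]) - side(edges[h], edges[h], edges[h + 1])
        for h in range(1, len(edges) - 1)
    ]

# One cut point at 1.39: A is about 2.27, B is 2, so Delta is about 0.27.
print(side(1.39, 0.0, 1.39), side(1.39, 1.39, math.inf), deltas([1.39]))
```

A set of cut points is a candidate for MVS when every entry returned by `deltas` is close to zero.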


2.5 Second approximation 
In general, we may not expect the set $[x_h']$ derived from Eq. (14) to satisfy
Eq. (23). Thus, there is need for some method for adjusting the initial set
$[x_h']$ into a set $[x_h'']$ which then can be checked in Eq. (23) etc. We will present
such a method here.
Consider a rectangular distribution f(x) = 1. With no loss of generality, we
may assume 0 < x < b. For this distribution we have:

$$I_0 = J_0 = b, \qquad I_1 = J_1 = \tfrac{1}{2}b^2, \qquad I_2 = J_2 = \tfrac{1}{3}b^3 \tag{24}$$

Substituting these values in

$$\frac{J_2 - 2bJ_1 + b^2 J_0}{\sqrt{J_0 J_2 - J_1^2}} \tag{25}$$

which is $A_h$ for h = 1, we get

$$A = \frac{2}{\sqrt{3}}\,b \approx 1.15\,b \tag{26}$$

i.e. A changes at about 1.15 times the rate at which the interval length (0, b)
changes.

As the next step, consider that this rectangular distribution is divided into
L = 2 strata, at a point $x_1'$, $0 < x_1' < b$, arbitrarily chosen. This point $x_1'$ specifies
$A_1'$ and $B_1'$. Applying the result quoted above, we realize that changing $x_1'$ by
one unit will change $A_1'$ and $B_1'$ by approximately 1.15 units each and the
difference $\Delta_1'$ by approximately 2.3 units. It seems reasonable to determine a
second point $x_1''$ of stratification by setting $\Delta_1''$ equal to zero, where

$$\Delta_1'' = \Delta_1' + (x_1'' - x_1')\frac{4}{\sqrt{3}} \tag{27}$$

The solution of $\Delta_1'' = 0$ is given by

$$x_1'' = x_1' - \frac{\sqrt{3}}{4}\Delta_1' \tag{28}$$

The point $x_1''$ may reasonably be expected to be superior to $x_1'$ as an
approximation to the MVS point $x_1$.
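We will now generalize this to any number L of strata. For three consecutive
strata, with indices g, h and i, we have (in the rectangular case)

$$\frac{\partial A_h'}{\partial x_h'} = -\frac{\partial B_h'}{\partial x_h'} = \frac{2}{\sqrt{3}}, \qquad \frac{\partial A_h'}{\partial x_g'} = -\frac{\partial B_h'}{\partial x_i'} = -\frac{2}{\sqrt{3}} \tag{29}$$

while all expressions of the type $\partial A_h'/\partial x_i'$, $\partial B_h'/\partial x_g'$ etc. are equal to zero.
From $\Delta_h' = A_h' - B_h'$ we derive analogous expressions for $\partial \Delta_h'/\partial x_h'$ etc.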
These values will now be used in the following way. We have a set $[x_h']$
with corresponding $\Delta_h'$-values. We want to find a set $[x_h'']$ with $|\Delta_h''| < |\Delta_h'|$.
By the mean value theorem we have

$$\Delta_h'' = \Delta_h' + (x_g'' - x_g')\frac{\partial \Delta_h'}{\partial x_g'} + (x_h'' - x_h')\frac{\partial \Delta_h'}{\partial x_h'} + (x_i'' - x_i')\frac{\partial \Delta_h'}{\partial x_i'}, \qquad h = 1, \ldots, L-1 \tag{30}$$

where we put $x_0'' = x_0'$ and $x_L'' = x_L'$. Solving this system with $\Delta_h'' = 0$ for $x_h''$ gives the set
wanted.
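The matrix M of $\partial \Delta_h'/\partial x_j'$ is, under the approximating rectangular distribution,

$$M = \frac{2}{\sqrt{3}}\begin{pmatrix} 2 & -1 & 0 & \cdots & 0 \\ -1 & 2 & -1 & \cdots & 0 \\ 0 & -1 & 2 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 2 \end{pmatrix} \tag{31}$$

the matrix being a square matrix with L-1 rows and columns.

In the first and last rows we assumed a rectangular distribution with finite
range, limited at one end by a point of very large absolute value. The
larger the number L of strata, the better the approximation by the rectangular
distribution. The inverse of M is

$$M^{-1} = \frac{\sqrt{3}}{2L}\begin{pmatrix} L-1 & L-2 & \cdots & 2 & 1 \\ L-2 & 2(L-2) & \cdots & 4 & 2 \\ \vdots & & & & \vdots \\ 2 & 4 & \cdots & 2(L-2) & L-2 \\ 1 & 2 & \cdots & L-2 & L-1 \end{pmatrix} \tag{32}$$

Thus, the new set $[x_h'']$ is found by computing

$$\begin{aligned} x_1'' &= x_1' - \frac{\sqrt{3}}{2L}\bigl[(L-1)\Delta_1' + (L-2)\Delta_2' + \cdots + \Delta_{L-1}'\bigr] \\ x_2'' &= x_2' - \frac{\sqrt{3}}{2L}\bigl[(L-2)\Delta_1' + 2(L-2)\Delta_2' + \cdots + 2\Delta_{L-1}'\bigr] \\ &\;\;\vdots \\ x_{L-1}'' &= x_{L-1}' - \frac{\sqrt{3}}{2L}\bigl[\Delta_1' + 2\Delta_2' + \cdots + (L-1)\Delta_{L-1}'\bigr] \end{aligned} \tag{33}$$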

If necessary, the procedure may be repeated to give a third set $[x_h''']$ etc.

As the final step in our adjustment procedure, we now pretend that the pro- 
cedure discussed above for the rectangular case, holds reasonably well also for 
other cases. 
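
The adjustment step of this paragraph can be written compactly by noting that the inverse matrix has the entries min(i, j)(L - max(i, j)), scaled by sqrt(3)/(2L). The sketch below applies one such step; the function name `adjust` is hypothetical, and the closed form of the matrix entries is read off from the rows displayed above.

```python
import math

def adjust(points, deltas):
    """One adjustment step: x'' = x' - (sqrt(3) / (2 L)) * G * Delta'.

    G[i][j] = min(i, j) * (L - max(i, j)) for i, j = 1, ..., L-1, which
    reproduces the rows (L-1, L-2, ..., 1), (L-2, 2(L-2), ..., 2), etc.
    """
    L = len(points) + 1
    scale = math.sqrt(3.0) / (2.0 * L)
    new_points = []
    for i in range(1, L):
        corr = sum(min(i, j) * (L - max(i, j)) * deltas[j - 1] for j in range(1, L))
        new_points.append(points[i - 1] - scale * corr)
    return new_points

# L = 2 reduces to x'' = x' - (sqrt(3) / 4) * Delta'.
print(adjust([1.39], [0.2761]))
```

For a density that really is rectangular the step is exact: starting from any interior points, one application returns the equidistant points.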

In the following paragraphs, we will apply this procedure to some densities. 

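3. APPLICATION I: $f(x) = e^{-x}$

3.1 Derivation of y(u) and $I_p(u)$

From the definition of y(u), we get

$$y(u) = \int_0^u \sqrt{e^{-t}}\,dt = 2(1 - e^{-u/2}) \tag{34}$$

with y(0) = 0, y(∞) = 2.
Moreover,

$$I_p(u) = \int_0^u t^p e^{-t}\,dt \tag{35}$$

Evaluating the integral gives

$$\begin{aligned} I_0(u) &= 1 - e^{-u} \\ I_1(u) &= 1 - e^{-u} - u e^{-u} \\ I_2(u) &= 2 - e^{-u}(u^2 + 2u + 2) \end{aligned} \tag{36}$$
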
3.2 Numerical calculations 


3.2.1 L = 2 strata. The first approximation $x_1'$ is given by the root of the
following equation

$$\int_0^{x_1'} \sqrt{f(t)}\,dt = \frac{1}{2}\int_0^{\infty} \sqrt{f(t)}\,dt \tag{37}$$

or in this case

$$2(1 - e^{-u/2}) = \tfrac{1}{2}\cdot 2 = 1 \tag{38}$$

giving $x_1' = 1.39$ correct to two decimal places.
Using the $I_p(u)$ functions, we now compute³ the following Table 95.


TABLE 95

DETAILS OF NUMERICAL CALCULATIONS OF FIRST APPROXIMATION
TO THE OPTIMUM POINT OF STRATIFICATION OF f(x) = e^{-x} FOR L = 2

              x_0' = 0    x_1' = 1.39    x_2' = ∞
I_0(x_h')     0.0000      0.7509         1.0000
I_1(x_h')     0.0000      0.4047         1.0000
I_2(x_h')     0.0000      0.3280         2.0000


We compute

A_1' = 2.2761
B_1' = 2.0000
Δ_1' = 0.2761

As $\Delta_1'$ is considerably larger than zero, we have to adjust $x_1'$. We thus
compute

$$x_1'' = x_1' - \frac{\sqrt{3}}{4}\Delta_1' = 1.39 - \frac{\sqrt{3}}{4}(0.2761) \tag{39}$$

giving $x_1'' = 1.27$.
We now repeat the computations contained in Table 95. From the resulting
$I_p(x_1'')$- and $J''$-functions, we compute

A_1'' = 2.0162
B_1'' = 2.0007
Δ_1'' = 0.0155





3 The computations have been carried out at the Institute of Statistics, University of Stockholm. 
Thus, $\Delta_1''$ is much closer to 0 than is $\Delta_1'$. Therefore, $x_1''$ should be closer to
the best point $x_1$ than is $x_1'$. In fact, as shown in par. 3.3, the variance
corresponding to $x_1''$ is less than the variance corresponding to $x_1'$.

3.2.2 L = 3 strata. The first approximations $x_1'$ and $x_2'$ are given by the
roots of the following equations (cf. par. 3.2.1)

$$2(1 - e^{-u/2}) = \tfrac{1}{3}\cdot 2, \qquad 2(1 - e^{-u/2}) = \tfrac{2}{3}\cdot 2 \tag{40}$$

giving $x_1' = 0.81$ and $x_2' = 2.20$. Using these values, we carry through the
computations necessary to obtain $x_1''$, $x_2''$, and the variances for these values. The
results are shown in Table 96.

3.2.3 L > 3 strata. Similar calculations have been carried out for L = 4 and
L=5 strata. As they are entirely analogous to those reported above, no details 
will be given. The results are summarized in Table 96. 
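
For this density the first approximations have a closed form, since 2(1 - exp(-u/2)) = 2h/L gives u = -2 ln(1 - h/L). The short sketch below lists them for L = 2, ..., 5; the function name is hypothetical and the printed values are simply this formula rounded to two decimals, not entries copied from Table 96.

```python
import math

def first_approximation_exponential(L):
    """Roots of 2 (1 - exp(-u / 2)) = (h / L) * 2 for h = 1, ..., L - 1."""
    return [-2.0 * math.log(1.0 - h / L) for h in range(1, L)]

for L in range(2, 6):
    print(L, [round(x, 2) for x in first_approximation_exponential(L)])
```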


3.3 Summary table of numerical results 


In the following Table 96 the numerical results are summarized. 


TABLE 96

FIRST AND SECOND APPROXIMATION TO OPTIMUM POINTS OF
STRATIFICATION OF f(x) = e^{-x} FOR L = 2, ..., 5 STRATA AND
THE ASSOCIATED VARIANCES

Columns: number of strata; approximation (first = x_h', second = x_h''); points of stratification; variances.
[The numerical entries of this table are not recoverable from the source text.]
4. APPLICATION II: $f(x) = x e^{-x}$
4.1 The $I_p(u)$ functions
For this density, we get

$$\begin{aligned} I_0(u) &= 1 - e^{-u} - u e^{-u} \\ I_1(u) &= 2 - e^{-u}\,[u^2 + 2u + 2] \\ I_2(u) &= 6 - e^{-u}\,[u^3 + 3u^2 + 6u + 6] \end{aligned}$$


4.2 Summary table of numerical results 


In the following Table 97, the numerical results are summarized. 


TABLE 97

FIRST AND SECOND APPROXIMATION TO OPTIMUM POINTS OF
STRATIFICATION OF f(x) = x e^{-x} FOR L = 2, ..., 5 STRATA
AND THE ASSOCIATED VARIANCES

Columns: number of strata; approximation (first = x_h', second = x_h''); points of stratification; variances.
[The numerical entries of this table are not recoverable from the source text.]





5. APPLICATION III: $f(x) = 2\varphi(x)$, 0 < x
5.1 The $I_p(u)$ functions
For the right half of the normal density, we get

$$\begin{aligned} I_0(u) &= \Phi(u) \\ I_1(u) &= -f(u) \\ I_2(u) &= I_0(u) + u I_1(u) \end{aligned}$$
5.2 Summary table of numerical results


In the following Table 98, the numerical results are summarized. 


TABLE 98

FIRST AND SECOND APPROXIMATION TO OPTIMUM POINTS OF
STRATIFICATION OF f(x) = 2φ(x), 0 < x, FOR L = 2, ..., 5
STRATA AND THE ASSOCIATED VARIANCES

Number of strata    Approximation    Variance
2                   First            0.02752
2                   Second           0.02729
3                   First            0.01301
3                   Second           0.01286
4                   First            0.00749
4                   Second           0.00751
5                   First            0.00489
5                   Second           0.00482

[The points of stratification in this table are not recoverable from the source text.]





6. APPLICATION IV: $f(x) = 2(1 - x)$
6.1 The $I_p(u)$-functions
The $I_p(u)$-functions are

$$\begin{aligned} I_0(u) &= 2u - u^2 \\ I_1(u) &= u^2 - \tfrac{2}{3}u^3 \\ I_2(u) &= \tfrac{2}{3}u^3 - \tfrac{1}{2}u^4 \end{aligned}$$





6.2 Summary table of numerical results 


In the following Table 99, the numerical results are summarized. 


7. COMPARISONS OF DIFFERENT COMPUTING TECHNIQUES 


For the research reported by Dalenius [3, Ch. 7], a very sizable amount of 
computation was performed in order to find approximations to MVS. These 
computations, performed by a professional computer of outstanding capacity, 
were of a “trial-and-error” nature. 
TABLE 99

FIRST AND SECOND APPROXIMATION TO OPTIMUM POINTS OF
STRATIFICATION OF f(x) = 2(1 - x) FOR L = 2, ..., 5 STRATA
AND THE ASSOCIATED VARIANCES

Columns: number of strata; approximation (first = x_h', second = x_h''); points of stratification; variances.
[The numerical entries of this table are not recoverable from the source text.]

By comparison, it is found that the new technique demands only a fraction
of the time used with the old technique to produce sets $[x_h']$ having variances
substantially equal to the variances computed earlier.

8. COMPARISONS OF DIFFERENT RULES FOR STRATIFICATION 
8.1 Comparison with the rule of thumb given in par. 1.2.3 


The rule of thumb given in par. 1.2.3 amounts to finding a set $[x_h']$ such that
$W_h \mu_h$ = constant = C for h = 1, ..., L. This condition is equivalent to

$$\begin{aligned} I_1(x_L') - I_1(x_{L-1}') &= C \\ I_1(x_{L-1}') - I_1(x_{L-2}') &= C \\ &\;\;\vdots \\ I_1(x_1') - I_1(x_0') &= C \end{aligned}$$

where $I_1(x_h') = I_1(u)$ for $u = x_h'$. Adding the L equations, and noting that
$I_1(x_0') = 0$, we easily get

$$C = \frac{1}{L}\,I_1(x_L')$$

as the value of C. From this, we derive the following result

$$I_1(x_{L-1}') = (L-1)\,C, \qquad I_1(x_{L-2}') = (L-2)\,C$$

etc., from which it is easy to compute the set $[x_h']$ satisfying this rule of thumb.
Using this set $[x_h']$, the variances may be computed and compared with
those arrived at by the approximate rule presented in this paper.
Such comparisons have been made for $f(x) = e^{-x}$ for L = 2, 3, 4 and 5 strata.
Thus, for e.g. L = 5, we get the set $[x_h']$ by solving the equations

$$\begin{aligned} 1 - e^{-x} - x e^{-x} &= 0.20 \\ 1 - e^{-x} - x e^{-x} &= 0.40 \\ 1 - e^{-x} - x e^{-x} &= 0.60 \\ 1 - e^{-x} - x e^{-x} &= 0.80 \end{aligned} \tag{47}$$

The solution $x_1', \ldots, x_4'$ to these equations is given below together with the
corresponding first approximations derived by means of the technique
presented in this paper.

TABLE 100a

COMPARISON OF POINTS OF STRATIFICATION OF f(x) = e^{-x} FOR L = 2, ..., 5
STRATA AS COMPUTED BY THE RULES GIVEN IN PAR. 1.2.3 AND PAR. 2.2

Points of stratification    According to rule given in
                            Par. 1.2.3    Par. 2.2
x_0'                        0.00          0
x_1'                        0.82
x_2'                        1.38
x_3'                        2.02
x_4'                        2.99

[The remaining Par. 2.2 entries are not recoverable from the source text.]

The corresponding variances are given in the following table. 


TABLE 100b

COMPARISON OF VARIANCES IN STRATIFIED SAMPLING FROM f(x) = e^{-x}
FOR L = 2, ..., 5 STRATA AS COMPUTED BY THE
RULES GIVEN IN PAR. 1.2.3 AND PAR. 2.2

Number of strata    Variances when using rule given in
                    Par. 1.2.3    Par. 2.2
2                   0.3079
3                   0.1556
4                   0.0950
5                   0.0639

[The Par. 2.2 entries are not recoverable from the source text.]
8.2 Comparisons with equidistant stratification

Equidistant stratification is applicable only to a density with a finite range.
In order to throw some light on its efficiency, it has been applied to
f(x) = 2(1 - x); the resulting variances are compared in Table 101 with those
resulting from using the first approximations derived by means of the technique given
in par. 2.2.


TABLE 101

COMPARISON OF VARIANCES IN STRATIFIED SAMPLING FROM
f(x) = 2(1 - x) FOR L = 2, ..., 5 STRATA AS COMPUTED BY
THE RULES GIVEN IN PAR. 1.2.4 AND PAR. 2.2

Number of strata    Variances when using rule given in
                    Par. 1.2.4    Par. 2.2
2                   0.0181        0.0152
3                   0.0091        0.0069
4                   0.0046        0.0044
5                   0.0038        0.0029
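
A comparison of this kind can be recomputed from the I_p(u)-functions of par. 6.1. The sketch below evaluates the minimum variance of Eq. (8) with n = 1, using the identity W_h sigma_h = sqrt(J_0 J_2 - J_1^2), for equidistant points and for the first approximations of par. 2.2. The function names are hypothetical, and the output is not guaranteed to agree with the rounded table entries in every digit.

```python
import math

def I(p, u):
    """I_p(u): integral of t^p * 2 (1 - t) from 0 to u, for p = 0, 1, 2."""
    return (2 * u - u ** 2, u ** 2 - 2 * u ** 3 / 3, 2 * u ** 3 / 3 - u ** 4 / 2)[p]

def variance(points):
    """(sum_h W_h sigma_h)^2, i.e. Eq. (8) with n = 1, for cut points in (0, 1)."""
    edges = [0.0] + list(points) + [1.0]
    total = 0.0
    for lo, hi in zip(edges, edges[1:]):
        J0, J1, J2 = (I(p, hi) - I(p, lo) for p in range(3))
        total += math.sqrt(max(J0 * J2 - J1 ** 2, 0.0))  # W_h * sigma_h
    return total ** 2

def equidistant(L):
    """Cut points of par. 1.2.4 on the range (0, 1)."""
    return [h / L for h in range(1, L)]

def cum_sqrt_f(L):
    """Cut points of par. 2.2: y(u) is proportional to 1 - (1 - u)^(3/2)."""
    return [1.0 - (1.0 - h / L) ** (2.0 / 3.0) for h in range(1, L)]

for L in range(2, 6):
    print(L, round(variance(equidistant(L)), 4), round(variance(cum_sqrt_f(L)), 4))
```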
HOW MANY OF A GROUP OF RANDOM NUMBERS WILL BE
USABLE IN SELECTING A PARTICULAR SAMPLE?*


Howard L. Jones
University of Chicago


When a sample is selected from a finite population by employing 
random numbers, certain numbers may have to be discarded as not 
being usable. In the first place, some of the random numbers may not 
correspond to serial numbers in the population. In the second place, 
some random numbers may be duplicates of others. The number of us- 
able random numbers remaining is a random variable with a probabil- 
ity distribution. Exact formulas for this distribution, and for the fac- 
torial moments of the difference between the number remaining and the 
population size, are derived and discussed. Approximations to the cumu- 
lative probability distribution are also suggested, and investigated for 
special cases. These approximations are of some importance in minimiz- 
ing costs where high-speed computers are used in selecting or generating 
random numbers, and the initial selection of too many or too few num- 
bers causes trouble and expense. The number of usable random numbers 
corresponds to the number of occupied cells for some subclass of an 
entire class of cells among which balls or other objects are randomly 
distributed. For the special case where all random numbers in a set cor- 
respond to serial numbers in a sampled population, the problem of pre- 
dicting the number of usable random numbers is statistically equivalent 
to the classical occupancy problem. The problem discussed here is not 
to be confused with the closely related problem of predicting how many 
successive random numbers must be selected before the number of us- 
able random numbers agrees with some previously specified number. 


INTRODUCTION 
1. The Problem 


Suppose a file of 18,000 serially numbered cards includes 15,965 cards that
refer to customers' accounts, one account per card. Suppose we wish to
select a sample of these accounts by employing a table of random numbers. We 
might proceed as follows. 

First, we might begin at a haphazardly chosen place in the table, and copy a 
series of five-digit random numbers, interpreting the first digit as 1 if it is odd, 
and 0 if it is even. Thus, the random digits 35679 would be interpreted as 
15,679; the random digits 62908 would be interpreted as 2908. Next, we might 
arrange the copied numbers, as interpreted, in order of size, excluding dupli- 
cates and numbers in excess of 18,000. Finally, we might match the remaining 
numbers with the serial numbers in the card file, and select those matching 
cards that refer to customers’ accounts. The related customers’ accounts would 
then be taken as our sample. Procedures similar to this are frequently employed 
in many business organizations. 





* This paper was presented August 28, 1958, at the annual meeting of the Institute of Mathematical Statistics, 
under the title “The Number of Occupied Cells of a Particular Subclass (When Objects are Assigned to Cells at 
Random).” It is based on research carried on while the author was with the Illinois Bell Telephone Company. 




In carrying out a procedure of this kind, it is sometimes convenient to use 
tabulating cards with prepunched random digits. A file of such tabulating cards,
equivalent to its published table [8] of a million random digits, can be obtained 
from The RAND Corporation in Santa Monica, California. Each such card 
has 50 random digits, in addition to a serial number. In using a file of these 
cards to select a sample like the one just described, we might first cut the deck 
of cards in haphazard fashion and interchange the two ends as in cutting play- 
ing cards. We might also haphazardly select five consecutive columns of random 
digits as the columns to be used for our purpose. We might sort on the first two 
of these five columns, and then eliminate cards for which an odd digit in the 
first column is paired with an 8 or 9 in the second column. The remaining cards
for which the first digit is odd might then be combined and sorted on the last 
four digits. Likewise, the cards for which the first digit is even might be com- 
bined and sorted on the last four digits. During this sorting procedure, cards 
for which the last four digits agree with a previously sorted card might be 
automatically eliminated. A list of the five-digit random numbers on the re- 
maining cards might then be printed, with even and odd first digits interpreted 
as 0 and 1, respectively. We could then select a sample of customers’ accounts 
by matching the random numbers on the list with the serial numbers in the 
card file of these accounts (00000 being interpreted as 18,000) and discarding
each listed number that does not correspond with a customer’s account. The 
tabulating cards could be restored to their original order by sorting on their 
serial numbers. 

The procedure just described is fairly simple, and statistically acceptable. 
But it requires the selection of more random numbers than are finally used in 
selecting the sample. In the situation just described, if we initially select 1000 
random numbers, we shall discard 100, more or less, because they correspond 
with serial numbers larger than 18,000. Some more random numbers will be 
discarded because they are duplicates of other selected numbers. Finally, some 
random numbers will be lost because they match cards in the accounts file that 
do not actually refer to customers’ accounts. 

If the random numbers are chosen by processing tabulating cards with 
random digits, the problem arises as to how many cards to process initially in 
order to obtain a sample of a specified size. Selecting too few cards will require
the selection of additional cards and intersorting them either manually or by 
machine in their proper places. Selecting too many cards may require the 
elimination of the excess, as by resorting in the original serial number order 
and then eliminating cards with the highest serial numbers, or by some pro- 
cedure that amounts to renumbering the cards selected and then randomly 
choosing some of the new numbers to designate the cards to be eliminated. The 
initial selection of too many or too few cards may thus require an additional 
operation that is somewhat expensive and time-consuming. 

Unfortunately, when the random numbers are initially selected, we cannot
determine precisely how many numbers will be usable. The reason is that the
number of usable numbers is a random variable. The probability distribution
of this variable, denoted by the letter s, depends on three parameters, which
we denote by the letters T, n, and N. Here, T denotes the size of the subclass
of individuals or items that we wish to sample, which we shall refer to as the
"target subclass"; n is the size of the group of random numbers initially selected;
and N might be described as the size of the entire class or universe from which
the initial selection is made. For the example we have been discussing,
T = 15,965, n = 1000, N = 20,000. We assign N the value of 20,000 (not 18,000
or 15,965) because the 1000 random numbers initially selected were chosen 
from a set of numbers that ranged, in effect, from 00000 to 19999, after the 
first digit in each such number had been interpreted as 0 or 1. We would also 
put N = 20,000 if we were to follow the possible alternative procedure of initially 
selecting 1000 five-digit random numbers from 00000 to 99999, then finding the 
remainders after dividing the numbers 00000 to 89999 among the selections by 
18,000, and finally matching these remainders, after excluding duplicates, 
against the serial numbers of the customers’ accounts we wish to sample (in- 
terpreting 00000 as 18,000). 

The problem discussed here consists essentially in computing or approximat- 
ing the distribution of the random variable s. The first moment of the distri- 
bution can be used to estimate the number of usable random numbers remain- 
ing. The cumulative distribution can be used to compute limits within which 
this number may be expected to fall in a given situation. 


2. The Occupancy Problem 


Suppose that N cells, serially numbered from 1 to N, belong to some specified
class, and that T of these cells belong to some subclass, which we call the
target subclass, in which we are particularly interested. Suppose that n balls
(or other objects) are distributed at random over the N cells in the whole class,
by drawing n random numbers from 1 to N and assigning the n balls to the
cells for which the serial numbers correspond to the random numbers drawn.
Then the number of cells occupied by one or more balls will obviously be equal
to the number of different random numbers selected; and the number of
occupied cells in the target subclass of size T will evidently be equal to the
number of different random numbers included in the n selections that
correspond to serial numbers for this subclass. Thus, the probability that s of the T
cells in the target subclass will be occupied is precisely the probability that s
of n numbers, randomly selected with replacement from the integers 1 to N,
will be usable in selecting a sample of the cells in the target subclass.
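
The equivalence can be illustrated by a small simulation. The sketch below draws n integers with replacement from 1 to N and counts the distinct ones that fall in the target subclass. Which serial numbers make up the target subclass is not stated in the text, so the example assumes, purely for illustration, that the target consists of the integers 1 to 15,965; the function name and the seed are likewise arbitrary.

```python
import random

def usable_count(n, N, is_target, rng):
    """Draw n integers from 1..N with replacement and count the distinct
    ones that fall in the target subclass (the number s of the text)."""
    drawn = {rng.randint(1, N) for _ in range(n)}  # duplicates collapse in the set
    return sum(1 for k in drawn if is_target(k))

rng = random.Random(1959)  # fixed seed so the run is repeatable
s = usable_count(1000, 20_000, lambda k: k <= 15_965, rng)
print(s)  # one realization of the random variable s
```

Repeating the call many times gives an empirical picture of the distribution of s for given n, T and N.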

The special case where T = N was discussed by W. L. Stevens [10], who gave
the probability distribution of the number of occupied cells, and derived the
moments of this distribution. The equivalence of this case to a random number
problem has been indicated by Feller [3, p. 64 ff.].

The cases where T = N = n and T = ½N = n have been discussed by F. N.
David [1], who refers to the mathematical treatment by Laplace and to earlier
papers by others. Both Stevens and David were interested in testing hypotheses
about frequency distributions. As David points out, the usual chi-square test
of goodness of fit, which is only approximate, is not very satisfactory when
the sample size is small. To overcome the difficulty, she proposes that the
theoretical distribution (under the null hypothesis) be divided into a number
of strata with equal probabilities. A sample is selected from the actual distribution
to be tested, and the number of "occupied" strata in the theoretical distribution
is counted. By computing the probability of having as few (or as many)
occupied strata as the number observed, an exact test of goodness of fit can be
made. In testing against certain hypotheses, David suggests designating one-half
of the total number of strata as the subclass of interest, and basing the
test on the number of occupied strata in that subclass.

The results for the general case discussed in the present paper, where the 
size of the subclass may be any integer from 1 to N, may be regarded as an 
extension of the results obtained by Stevens and David. 
EXACT FORMULAS


4. The Probability Distribution 


The following notation will be used.

Symbol | Random Number Problem | Occupancy Problem
N | Number of serial numbers (1 to N) for entire class sampled | Total number of cells in entire class
T | Number of serial numbers belonging to target subclass | Number of cells belonging to target subclass
n | Number of random numbers selected with replacement from numbers 1 to N | Number of objects assigned at random over all N cells
t | Number of random numbers selected that are associated with target subclass | Number of objects assigned to cells in target subclass
m | Total number of random numbers selected after duplicates have been eliminated | Total number of occupied cells for entire class
s | Number of random numbers selected, after eliminating duplicates, that belong to target subclass | Number of occupied cells in target subclass


The probability distribution of m may be written

$$Q(m \mid n, N) = \frac{N!\,k_{m,n}}{N^n\,(N-m)!} \tag{4.1}$$

$$k_{m,n} = \sum_{r=0}^{m}\frac{(-1)^r (m-r)^n}{r!\,(m-r)!} \tag{4.2}$$

This is equivalent to Stevens' [10] formula for the probability that m cells will
be occupied if n objects are randomly distributed over N cells.
The numbers denoted here by the symbol $k_{m,n}$ are the well known Stirling
numbers of the second kind. We may also write

$$k_{m,n} = \Delta^m 0^n / m! \tag{4.3}$$

where the numbers denoted by the symbol $\Delta^m 0^n$ are the so-called "differences
of zero"; that is, $\Delta^m 0^n$ is the mth leading difference of the nth power of the
integers 0, 1, 2, 3, .... Values of $k_{m,n}$ for n ≤ 10 are shown in Table 106
[2, p. 212; 4, table 22; 7, table 49]. For more extensive tables, see Gupta [5]
and Schäfer [9]. Gupta's tables are exact for m, n ≤ 50. Schäfer's tables extend
to m ≤ 32 and n ≤ 100, to six significant figures.
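
Formula (4.2) can be evaluated exactly with rational arithmetic, which gives a quick way to regenerate small values of k_{m,n}. The sketch below is one such evaluation; the function name `k` mirrors the symbol in the text and is otherwise arbitrary.

```python
from fractions import Fraction
from math import factorial

def k(m, n):
    """Stirling number of the second kind via Eq. (4.2), with 0**0 taken as 1."""
    total = sum(
        Fraction((-1) ** r * (m - r) ** n, factorial(r) * factorial(m - r))
        for r in range(m + 1)
    )
    return int(total)  # the sum is always an integer

# Values for n = 6 and m = 0, ..., 6.
print([k(m, 6) for m in range(7)])  # [0, 1, 31, 90, 65, 15, 1]
```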

If n random selections are made with replacement from the integers 1 to N,
inclusive, the probability that exactly s of these integers, after eliminating
duplicates, will belong to a subclass of T integers may be written as

$$P(s \mid n, T, N) = \frac{T!\,(N-T)!}{N^n\,(T-s)!}\sum_{m=s}^{S}\binom{m}{s}\frac{k_{m,n}}{(N-T+s-m)!} = \frac{1}{N^n}\binom{T}{s}\sum_{m=s}^{S}\binom{N-T}{m-s}\,m!\,k_{m,n} \tag{4.4}$$

where S is equal to n or N - T + s, whichever is smaller. An alternative way of
writing (4.4) is

$$P(s \mid n, T, N) = \frac{T!}{N^n\,(T-s)!}\sum_{t=s}^{n}\binom{n}{t}(N-T)^{n-t}\,k_{s,t} = \frac{1}{N^n}\binom{T}{s}\Delta^s (N-T)^n, \tag{4.5}$$

where $k_{s,t}$ is defined by (4.2). For these formulas, we adopt the usual conventions
that 0! and $0^0$ are equal to 1, and that

$$k_{m,n} = \begin{cases} 0 & \text{if } m = 0 < n, \\ 1 & \text{if } m = 0 = n. \end{cases} \tag{4.6}$$

TABLE 106
VALUES OF k_{m,n} FOR SMALL n
[The entries of this table are not recoverable from the source text.]
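
The two ways of writing the distribution can be checked against each other for small parameter values. The sketch below evaluates the second form of (4.4) and the first form of (4.5) with exact rational arithmetic; the function names are hypothetical, and both functions assume 0 <= s <= min(n, T).

```python
from fractions import Fraction
from math import comb, factorial

def stirling2(m, n):
    """k_{m,n} of Eq. (4.2)."""
    return int(sum(
        Fraction((-1) ** r * (m - r) ** n, factorial(r) * factorial(m - r))
        for r in range(m + 1)
    ))

def prob_s(s, n, T, N):
    """P(s | n, T, N) from the second form of Eq. (4.4)."""
    S = min(n, N - T + s)
    total = sum(
        comb(N - T, m - s) * factorial(m) * stirling2(m, n) for m in range(s, S + 1)
    )
    return Fraction(comb(T, s) * total, N ** n)

def prob_s_alt(s, n, T, N):
    """The same probability from the first form of Eq. (4.5)."""
    total = sum(
        comb(n, t) * (N - T) ** (n - t) * stirling2(s, t) for t in range(s, n + 1)
    )
    return Fraction(factorial(T) * total, N ** n * factorial(T - s))

n, T, N = 5, 3, 7
dist = [prob_s(s, n, T, N) for s in range(min(n, T) + 1)]
print(dist, sum(dist))
```

Exact fractions avoid the cancellation that the alternating sum in (4.2) would cause in floating point when n is large.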

To derive (4.4), we observe that since the n selections are randomly selected
with replacement from the integers 1 to N, the distribution of the m different
numbers selected, after eliminating duplicates, between the target subclass of
integers of size T and the subclass of all the remaining integers (of size N - T)
is random. For a particular m, the conditional probability that s different
integers selected will belong to the target subclass and m - s will not is therefore
of the hypergeometric form

$$\Pr(s \mid m, T, N) = \binom{T}{s}\binom{N-T}{m-s}\bigg/\binom{N}{m} \tag{4.7}$$

Formula (4.4) is obtained by multiplying (4.1) and (4.7) and summing the
product over admissible values of m.

To derive (4.5), we observe that if a sample of size n is selected with replacement
from the integers 1 to N, the probability that t sample integers will belong
to a target subclass of size T is of the binomial form

$$\Pr(t \mid n, T, N) = \binom{n}{t}\left(\frac{T}{N}\right)^{t}\left(\frac{N-T}{N}\right)^{n-t} \tag{4.8}$$

From (4.1), the conditional probability of finding s different integers among
these t sample integers is

$$Q(s \mid t, T) = \frac{T!\,k_{s,t}}{T^t\,(T-s)!} \tag{4.9}$$

The second member of formula (4.5) is obtained by multiplying (4.8) and (4.9)
and summing over admissible values of t.

The last member of (4.5), expressed in the notation of the calculus of finite
differences, is a generalization of David's [1] formula for the case where $T = \tfrac{1}{2}N$.
It follows from the fact that the sth leading difference of the nth power of the
integers M, M+1, M+2, ..., may be written as a polynomial of degree
$n - s$, thus,

$$
\Delta^{s} M^{n} = a_{s} M^{n-s} + a_{s+1} M^{n-s-1} + \cdots + a_{n} = \sum_{t=s}^{n} a_{t} M^{n-t}
\tag{4.10}
$$

with coefficients

$$
a_{t} = \binom{n}{t} \Delta^{s} 0^{t} = \binom{n}{t}\, s!\, k_{s,t}.
$$


In equation (4.5), M=N—T. 
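
As an illustration of the finite-difference form, the sketch below evaluates the last member of (4.5) directly and checks the polynomial expansion (4.10). It assumes the usual definition of the leading difference, $\Delta^s f(M) = \sum_j (-1)^{s-j}\binom{s}{j} f(M+j)$; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def forward_difference(s, M, n):
    """s-th leading difference of x**n at x = M (Python's 0**0 is 1, as in the text)."""
    return sum((-1) ** (s - j) * comb(s, j) * (M + j) ** n for j in range(s + 1))


def usable_pmf_fd(s, n, T, N):
    """Last member of (4.5): C(T, s) * Delta^s (N - T)^n / N^n."""
    return Fraction(comb(T, s) * forward_difference(s, N - T, n), N ** n)


def difference_as_polynomial(s, M, n):
    """Right side of (4.10) with a_t = C(n, t) * Delta^s 0^t."""
    return sum(comb(n, t) * forward_difference(s, 0, t) * M ** (n - t)
               for t in range(s, n + 1))
```

For example, `forward_difference(2, 3, 5)` is $5^5 - 2\cdot 4^5 + 3^5 = 1320$, and `difference_as_polynomial(2, 3, 5)` returns the same value.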


5. Moments of the Distribution 


The moments of the probability distribution described by equations (4.4) 
can be derived as follows. 
Since (4.5) describes a probability distribution, its sum over all admissible
values of s must be equal to 1. We write

$$
\frac{1}{N^{n}} \sum_{s=0}^{u} \binom{T}{s} \Delta^{s} (N-T)^{n} = 1,
\tag{5.1}
$$

where u is equal to n or T, whichever is smaller. This implies that for any fixed
integers n, K, and H, and any variable s satisfying the inequalities

$$
0 \le s \le n \ge 1, \qquad s \le K \le H,
\tag{5.2}
$$

we have the identity

$$
\frac{1}{H^{n}} \sum_{s=0}^{u} \binom{K}{s} \Delta^{s} (H-K)^{n} = 1,
\tag{5.3}
$$

where u is now equal to n or K, whichever is smaller.
Now put

$$
H = N - h, \qquad K = T - h.
\tag{5.4}
$$

Then by replacing H and K with their equivalents, and multiplying both sides
of (5.3) by the factor

$$
\frac{T!\,(N-h)^{n}}{(T-h)!\,N^{n}},
\tag{5.5}
$$

we obtain an expression which may be written as

$$
\frac{1}{N^{n}} \sum_{s=0}^{U} (T-s)_{h} \binom{T}{s} \Delta^{s} (N-T)^{n} = (T)_{h} \left(\frac{N-h}{N}\right)^{n},
\tag{5.6}
$$


where U is equal to n or $T - h$, whichever is smaller, and the meaning of the
expressions $(T-s)_h$ and $(T)_h$ is indicated by the equation

$$
(x)_{h} = x(x-1)(x-2)\cdots(x-h+1);
\tag{5.7}
$$

that is,

$$
(x)_{h} =
\begin{cases}
\dfrac{x!}{(x-h)!} & \text{for } h \le x, \\
0 & \text{for } h > x.
\end{cases}
\tag{5.8}
$$

In particular,

$$
(T-s)_{h} = (T-s)(T-s-1)\cdots(T-s-h+1).
\tag{5.9}
$$


Since equation (5.6) holds where the summation extends over integral values
of s from 0 to n or $(T-h)$, whichever is smaller, and since $(T-s)_h$ is equal to 0 if
s is greater than $(T-h)$, the equation must also hold where the summation ex-
tends over integral values of s from 0 to u, that is, from 0 to n or T, whichever
is smaller. Comparing (5.6) with (5.1), we see that the expected value of $(T-s)_h$
is

$$
E(T-s)_{h} = (T)_{h} \left(\frac{N-h}{N}\right)^{n}
\tag{5.10}
$$

or

$$
E(T-s)_{h} =
\begin{cases}
\dfrac{T!}{(T-h)!} \left(\dfrac{N-h}{N}\right)^{n} & \text{if } h \le T, \\
0 & \text{if } h > T.
\end{cases}
\tag{5.11}
$$


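This is the general formula for the factorial moments of $T - s$, the number
of serial numbers belonging to the target subclass of size T that do not cor-
respond with one or more of the n random numbers selected with replacement
from the entire class of integers from 1 to N, inclusive. The expected values of
the powers of s can be obtained by expanding the left side of (5.11) in powers
of s, taking expected values term by term, and solving for $Es$, $Es^2$, etc. In
particular

$$
Es = T - T\left(\frac{N-1}{N}\right)^{n},
\tag{5.12}
$$

$$
Es^{2} = T^{2} - T(2T-1)\left(\frac{N-1}{N}\right)^{n} + T(T-1)\left(\frac{N-2}{N}\right)^{n},
\tag{5.13}
$$

$$
\sigma_{s}^{2} = T\left(\frac{N-1}{N}\right)^{n} + T(T-1)\left(\frac{N-2}{N}\right)^{n} - T^{2}\left(\frac{N-1}{N}\right)^{2n}.
\tag{5.14}
$$

In actually computing $Es$ and $\sigma_s^2$, it may be convenient to employ the first
few terms of the expansion

$$
\left(\frac{N-h}{N}\right)^{n} = 1 - \binom{n}{1}\frac{h}{N} + \binom{n}{2}\left(\frac{h}{N}\right)^{2} - \binom{n}{3}\left(\frac{h}{N}\right)^{3} + \cdots.
\tag{5.15}
$$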
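
The moment formulas lend themselves to a short numerical sketch. The code below evaluates the factorial moments of (5.11) and derives the mean and variance of s from the first two of them; the function names are illustrative, and the variance is obtained here as the variance of $T - s$, which equals that of s.

```python
def factorial_moment_unused(h, n, T, N):
    """E[(T - s)_h] as given by (5.11)."""
    if h > T:
        return 0.0
    falling = 1
    for i in range(h):
        falling *= T - i  # builds T!/(T-h)!
    return falling * ((N - h) / N) ** n


def mean_usable(n, T, N):
    """Es = T - E(T - s), using the first factorial moment."""
    return T - factorial_moment_unused(1, n, T, N)


def var_usable(n, T, N):
    """Variance of s from the first two factorial moments of T - s."""
    m1 = factorial_moment_unused(1, n, T, N)
    m2 = factorial_moment_unused(2, n, T, N)
    return m2 + m1 - m1 ** 2
```

For N = T = 5 and n = 5 this gives a mean of 3.3616, and for N = T = 10 and n = 5 a mean of 4.0951, the values of Em quoted in Table 113.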


APPROXIMATIONS 
6. Poisson and Binomial Distributions 


In deciding how many random numbers to select in order to obtain a specified 
number of sample individuals after eliminating duplicates and individuals not 
in the target class, we need to know the risk of having too few or too many 
different individuals remaining after such elimination. The size of this risk is
provided by the cumulative probability distribution of s, the number of usable 
random numbers as defined in Section 4. Let us first consider the simple case 
where all individuals in a class or universe of size N belong to the target sub- 
class in which we are interested. For this case, the distribution of s is the same 
as the distribution of m as given by equation (4.1). 

Suppose, for example, that we want a sample of 9 different individuals from a
universe of size 50. Then from the cumulative probability distribution of m,
we find that if we select 10 random numbers with replacement, the risk of
having fewer than 9 different sample individuals associated with these random
numbers is less than .20, and the risk of having more than 9 is less than .39.
We can thus be about 42 per cent sure of getting exactly 9 different individuals
by selecting 10 random numbers, and 80 per cent sure of having at least 9
different individuals.
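
The three risks in this example can be recomputed from the exact distribution of m. The sketch below assumes, as in Section 4, that (4.1) has the form $N!\,k_{m,n}/((N-m)!\,N^n)$ with $k_{m,n}$ the Stirling number of the second kind; the helper names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def stirling2(n, m):
    """k_{m,n}, assumed to be the Stirling number of the second kind."""
    if m == 0:
        return 1 if n == 0 else 0
    if m > n:
        return 0
    total = sum((-1) ** (m - j) * comb(m, j) * j ** n for j in range(m + 1))
    return total // factorial(m)


def occupancy_pmf(m, n, N):
    """Assumed form of (4.1), evaluated exactly."""
    if m > N:
        return Fraction(0)
    return Fraction(factorial(N) * stirling2(n, m), factorial(N - m) * N ** n)


def risks(target, n, N):
    """Return P(m < target), P(m = target), P(m > target) as floats."""
    pmf = {m: occupancy_pmf(m, n, N) for m in range(min(n, N) + 1)}
    fewer = sum(p for m, p in pmf.items() if m < target)
    exact = pmf.get(target, Fraction(0))
    more = sum(p for m, p in pmf.items() if m > target)
    return float(fewer), float(exact), float(more)
```

Calling `risks(9, 10, 50)` returns the risk of too few, the chance of exactly 9, and the risk of too many different individuals for the case discussed above.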
Since the computation of the exact cumulative probability distribution of
m, for large n and N, is tedious and time-consuming unless high-speed com- 
puters are employed, it is desirable to make use of approximations based on 
tables that are generally available. The tables of the cumulative binomial and 
Poisson distributions appear to be most suitable for this purpose. 

Several possible approximations suggest themselves. For example, it is evi-
dent from equation (5.11) that if $T = N$, and if N and n are large in comparison
with h, then the factorial moments of $N - m$ may be approximated from the
relationship

$$
E(N-m)_{h} \approx N^{h} e^{-nh/N} = a^{h},
\tag{6.1}
$$

where

$$
a = N e^{-n/N}.
\tag{6.2}
$$

Since the factorial moments of the Poisson distribution

$$
P_{1}(N-m \mid a) = \frac{e^{-a} a^{N-m}}{(N-m)!}
\tag{6.3}
$$

are equal to $a^h$, one might suppose that (6.3) would result in a good approxima-
tion to the distribution of $N - m$. In fact, this approximation is suggested in one
of the references [3, page 72]. It appears, however, that while the error in em-
ploying (6.1) to approximate $E(N-m)_1$ and $E(N-m)_2$ is fairly small relative
to the exact value, provided n and N are large in comparison with h, the error
in computing the second and higher moments about the mean of the distribu-
tion of m may be several times the exact value if N/n is large. Equation (6.3)
therefore appears to be of limited value for approximating the probability dis-
tribution of m, except for the approximation of its mean.

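For large m in the neighborhood of n, values of $k_{m,n}$ may be approximated by
the following definite integrals.

$$
k_{n-1,n} \approx \int_{0}^{n} x\,dx = \frac{n^{2}}{2};
\tag{6.4}
$$

$$
k_{n-2,n} \approx \int_{0}^{n}\!\int_{0}^{x_{2}} x_{1} x_{2}\,dx_{1}\,dx_{2} = \frac{n^{4}}{8};
\tag{6.5}
$$

$$
k_{m,n} \approx \int_{0}^{n}\!\int_{0}^{x_{n-m}}\!\cdots\!\int_{0}^{x_{2}} x_{1} x_{2} \cdots x_{n-m}\,dx_{1}\,dx_{2} \cdots dx_{n-m}
  = \frac{1}{(n-m)!}\left(\frac{n^{2}}{2}\right)^{n-m}.
\tag{6.6}
$$

This approximation appears in one of the references [6, page 174]. We also
note that for large N/m, equation (4.1) may be written in the approximate
form

$$
Q(m \mid n, N) \approx N^{m-n} k_{m,n}.
\tag{6.7}
$$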
Combining (6.6) and (6.7) suggests the Poisson approximation,

$$
P_{2}(n-m \mid b) = \frac{e^{-b} b^{n-m}}{(n-m)!},
\tag{6.8}
$$

where

$$
b = n^{2}/(2N).
\tag{6.9}
$$

If $N/n^2$ is large, the value of b in (6.9) is of the same order of magnitude as

$$
E(n-m) = n - N + N\left(1 - \frac{1}{N}\right)^{n}
       = \frac{n(n-1)}{2N} - \frac{n(n-1)(n-2)}{6N^{2}} + \cdots.
\tag{6.10}
$$

This fact suggests the Poisson approximation

$$
P_{3}(v-m \mid c) = \frac{e^{-c} c^{v-m}}{(v-m)!},
\tag{6.11}
$$

where v is equal to n or N, whichever is smaller, and

$$
c = E(v-m).
\tag{6.12}
$$

Another possibility suggested by (6.2) and (6.12) is

$$
P_{4}(v-m \mid d) = \frac{e^{-d} d^{v-m}}{(v-m)!},
\tag{6.13}
$$

where

$$
d = v - N + N e^{-n/N}.
\tag{6.14}
$$


The approximations are the same as for (6.3) if n=N. 
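
To make the three Poisson approximations concrete, the following sketch evaluates the approximate cumulative probability $P(m \le x)$ under (6.8), (6.11), and (6.13). It treats $n - m$ (or $v - m$) as a Poisson variable, so that $P(m \le x)$ is an upper Poisson tail. The function names and the string labels used to select a formula are illustrative only.

```python
from math import exp, factorial


def poisson_upper_tail(k, mean):
    """P(X >= k) for X distributed as Poisson(mean)."""
    below = sum(exp(-mean) * mean ** j / factorial(j) for j in range(max(k, 0)))
    return 1.0 - below


def expected_distinct(n, N):
    """Em, from (5.12) with T = N."""
    return N * (1 - ((N - 1) / N) ** n)


def poisson_cdf_distinct(x, n, N, formula):
    """Approximate P(m <= x) under formula '6.8', '6.11', or '6.13'."""
    v = min(n, N)
    if formula == "6.8":
        base, mean = n, n * n / (2 * N)                 # b of (6.9)
    elif formula == "6.11":
        base, mean = v, v - expected_distinct(n, N)     # c of (6.12)
    elif formula == "6.13":
        base, mean = v, v - N + N * exp(-n / N)         # d of (6.14)
    else:
        raise ValueError("unknown formula label")
    return poisson_upper_tail(base - x, mean)
```

For N = 5 and n = 5, the approximate values of $P(m \le 1)$ computed this way agree with the entries .24242, .08419, and .11512 that appear in the corresponding row of Table 113.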

The fact that the binomial and Poisson distributions converge to the same
limiting values suggests using some form of the binomial. If we wish this distri-
bution to have the same number of terms as the exact distribution, we can
employ

$$
B_{1}(m-1 \mid v, p) = \binom{v-1}{m-1} p^{m-1} (1-p)^{v-m},
\tag{6.15}
$$

where v means the same as in (6.11), and

$$
p = \frac{Em - 1}{v - 1}.
\tag{6.16}
$$
If we wish to have a binomial distribution with the same mean and approxi-
mately the same variance as the exact distribution of m, without specifying the
number of terms otherwise, we can proceed as follows. We note the fact that for
any binomial distribution of the form

$$
B(x \mid r, p) = \binom{r}{x} p^{x} (1-p)^{r-x},
\tag{6.17}
$$

we have the relationships

$$
r = \frac{(Ex)^{2}}{Ex - \sigma_{x}^{2}}, \qquad p = \frac{Ex}{r}.
$$

We therefore put $x = v - m$ and compute a ratio

$$
\tilde{r} = \frac{(v - Em)^{2}}{v - Em - \sigma_{m}^{2}},
\tag{6.22}
$$

where $Em$ and $\sigma_m^2$ are found by substituting N for T in equations (5.12) and
(5.14). We round this value of $\tilde{r}$ to the nearest integer, which we denote by the
letter r, and compute p by employing (6.21). A binomial approximation to the
distribution of m can then be written in the form

$$
B_{2}(v-m \mid r, p) =
\begin{cases}
\dbinom{r}{v-m} p^{v-m} (1-p)^{r-v+m} & \text{if } v - r \le m \le v, \\
0 & \text{if } m < v - r,
\end{cases}
\tag{6.23}
$$

where

$$
p = \frac{v - Em}{r}.
\tag{6.24}
$$


If N is not too small, and if $n < N$, the value of $\tilde{r}$ computed from equation
(6.22) will closely agree with its limiting value,

$$
\lim_{N \to \infty} \tilde{r} = \frac{3n(n-1)}{2(4n-5)} \approx \frac{3n}{8}.
\tag{6.25}
$$

In other words, the probability distribution of m has about the same variance
as the binomial distribution with the same mean and 3/8 as many terms.
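
A small sketch shows how the unrounded number of binomial terms behaves as N grows. It computes the ratio $(v - Em)^2/(v - Em - \sigma_m^2)$ exactly with rational arithmetic, because the denominator is a small difference of large quantities. The limiting expression coded here, $3n(n-1)/(2(4n-5))$, is our reading of (6.25), obtained from the series (5.15); the function names are hypothetical, and n is assumed to be at least 2.

```python
from fractions import Fraction


def distinct_moments(n, N):
    """Exact mean and variance of m, from (5.12) and (5.14) with T = N."""
    q1 = Fraction(N - 1, N) ** n
    q2 = Fraction(N - 2, N) ** n
    mean = N * (1 - q1)
    var = N * q1 + N * (N - 1) * q2 - (N * q1) ** 2
    return mean, var


def unrounded_terms(n, N):
    """The ratio of (6.22) before rounding, with v = min(n, N)."""
    v = min(n, N)
    mean, var = distinct_moments(n, N)
    return float((v - mean) ** 2 / (v - mean - var))


def limiting_terms(n):
    """Assumed limiting value of the ratio as N grows without bound."""
    return 3 * n * (n - 1) / (2 * (4 * n - 5))
```

For N = 5 and n = 5 the ratio is about 2.3773, the first entry under T/N = 1.0 in Table 115, and for n = 5 the limit is 2.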

Table 113 compares these various approximations with the actual distribu- 
tion of m, on a cumulative basis, for n equal to 5 and 10, and N =5, 10, 20, 50, 
and 100. The comparison seems to indicate that one of the binomial approxi- 
mations usually yields the most satisfactory approximation. The one based on 
TABLE 113


COMPARISON OF CUMULATIVE PROBABILITY DISTRIBUTIONS FOR ESTI- 
MATING THE NUMBER OF DIFFERENT INDIVIDUALS 
IN A SAMPLE WITH REPLACEMENT 








Binomial 


Poisson Approximations Approximations 


Population size No. of 

(N) and sample different 
size (n) individuals Formula Formula Formula | Formula Formula 

(6.8) (6.11) (6.13) (6.15) (6.23) 








- 10882 -02586 -03932 -00000 
-24242 -08419 -11512 -02815 
-45619 -22661 -27995 - 19044 
-71270 -48738 .54878 -54132 
-91792 -80571° -84109 -87850 


N=5 
n=5 
Em =3.3616 


arond rf Oo 


-08849 -00912 -00240 -00476 -00000 
- 15958 -03827 A -02326 
- 26506 -13153 -06367 -09270 
-40419 -35536 : - 28825 
-56475 -71350 ‘ -65538 
-72359 -00000 -00000 -00000 


N=10 
n= 5 
Em= 4.0951 


ar wd oO 


15933 -00047 m -00033 
+ 22345 -00388 : -00291 
-30166 -02566 
-39205 -13020 
-49070 -46474 
-59204 -00000 


N =20 
n= 5 
Em= 4.52438125 


arkwwr co 


- 25850 -00001 
-30738 
-36031 
-41648 -02650 
-47482 
-53414 -00000 


N =50 
n= 65 
Em= 4.80396016 


arawondrK © 


-32190 -00000 
-35896 -00001 
-39753 -00030 
-43726 
-47778 
.51867 -00000 


ar ends & © 


-01249 
-04861 
- 15535 
-39196 -99950 
-74163 


N= 5 
n=10 
Em= 4.463129088) 


arwnd re © 


-00000 -00466 -03183 
-00000 -01325 -06809 
-00000 -03426 - 13337 
-00068 -07995 - 23782 
-01787 - 16689 . 38404 
- 14646 -30869 -55951 
.49161 -50141 - 73497 
-84723 -71096 -87535 
-98331 -88184 -95957 
-99964 -97475 -99326 
1.00000 | 1.00000 1.00000 


N =10 
n=10 
Em =6.513215599 


CONOR WNe OC 


_ 
o 

















* Most entries in the table are rounded to the last decimal place. 


(Continued on next page) 
TABLE 113 (continued)








Binomial 


Poisson Approximations Approximations 


Population size No. of 

(N) and sample different 
size (n) individuals Formula Formula Formula Formula | Formula Formula 

(6.3) (6.8) (6.11) (6.13) (6.15) (6.23) 








-02347 -00028 -00004 -00008 -00000 
-04086 -00114 -00022 -00037 
-06811 -00425 -00101 -00162 
- 10854 -01419 -00424 -00632 
- 16519 -04202 -01567 -02175 
- 23992 - 10882 -05040 -06521 
- 33233 - 24242 - 13835 - 16718 
-43897 -45619 -31649 -35863 
- 55326 -71270 -58711 -62819 
-66631 -91792 ‘ -88124 
- 76883 -00000 -00000 -00000 


N =20 
n=10 
Em =8 .0252611215* 


CSCC MN Qaranre oO 


= 


-09330 -00000 -00000 -00000 
- 12027 -00000 -00000 -00000 
- 15255 é -00000 -00001 
- 19040 d -00003 -00006 
- 23385 é -00026 -00042 
- 28268 
-33635 
- 39405 
-45465 
- 51682 
. 57909 


N =50 
n=10 
Em =9 .146359656- 


SCC On Oar wWNK OC 


~ 


- 17106 
- 19822 
- 22793 
-26011 
- 29461 
-33121 
- 36964 
-40957 
-45060 
-49232 
-53428 


N =100 
n=10 
Em =9 .561792499* 


0 
1 
2 
3 
4 
5 
6 
7 
8 
9 
0 


-_ 

















formula (6.23) seems to be about the best for most purposes. Incidentally, with
two exceptions, formula (6.25) leads to the same value of r as formula (6.22),
after rounding to the nearest integer, for each of the combinations of n and N
shown in the table. The exceptions are for (N = 5, n = 10) and for (N = 10,
n = 10).

Among Poisson approximations, the one based on formula (6.11) appears to 
be most satisfactory for assessing the risk of too few different random numbers. 
Formulas (6.13) and (6.14) suggest a connection with the theory of extreme 
values. 

Let us now consider approximations to the distribution of s, the number of 
different sample individuals that belong to a target subclass, and hence, the 
number of usable random numbers. It can be shown that if random selections 
from some population or class have a binomial distribution, then the selections 
belonging to a particular subclass will also have a binomial distribution. Since 
the results already obtained suggest that the distribution of m can usually be 
approximated satisfactorily by a binomial distribution, we infer that the distri- 
bution of s may also be approximated by a distribution of this form. 
Suppose we employ the approach used in developing formula (6.23), and first
compute a ratio

$$
\tilde{r}^{*} = \frac{(\tau - Es)^{2}}{\tau - Es - \sigma_{s}^{2}},
\tag{6.26}
$$

where $\tau$ is equal to n or T, whichever is smaller, and $Es$ and $\sigma_s^2$ are computed
from formulas (5.12) and (5.14). We can then write a binomial approximation
to the distribution of s as follows:

$$
B_{3}(\tau - s \mid r^{*}, p) =
\begin{cases}
\dbinom{r^{*}}{\tau - s} p^{\tau - s} (1-p)^{r^{*} - \tau + s} & \text{if } \tau - r^{*} \le s \le \tau, \\
0 & \text{if } s < \tau - r^{*},
\end{cases}
\tag{6.27}
$$

where $r^*$ is obtained by rounding $\tilde{r}^*$ to the nearest integer, and

$$
p = \frac{\tau - Es}{r^{*}}.
\tag{6.28}
$$

The limiting value of $\tilde{r}^*$, as $N \to \infty$, turns out to be equal to n, except for the
special case where $T = N$, in which case the limit is given by (6.25). Values of
$\tilde{r}^*$, before rounding off to the nearest integer, for selected values of N, n, and
T/N, are shown in Table 115.


TABLE 115
COMPUTED VALUES OF $\tilde{r}^{*}$


T/N =1.0 T/N =0.8 T/N =0.6 











2.3773 .1889 .9335 
.1741 .0786 7552 
-0835 -4089 .1014 
-0325 -9938 .5238 
-0161 -3811 . 7342 


-7229 -6484 .5376 
-8748 -4011 -7877 
3495 1328 .4748 
.0490 .9624 .3347 
-9522 .8453 .9642 





For large samples, where $n < T < N$, it should usually be satisfactory to em-
ploy the Poisson distribution

$$
P_{5}(n-s \mid \lambda) = \frac{e^{-\lambda} \lambda^{n-s}}{(n-s)!},
\tag{6.29}
$$

where

$$
\lambda = n - Es = n - T + T\left(\frac{N-1}{N}\right)^{n}.
\tag{6.30}
$$
Table 116 shows comparisons of approximations based on formulas (6.27) 
and (6.29) with the actual cumulative probability distributions for selected 
values of N, n, and T/N. 

Schafer and Weiss discuss the conditions for asymptotic normality of the 
occupancy distribution, equation (4.1) [9, 11]. 


TABLE 116 
COMPARISON OF CUMULATIVE PROBABILITY DISTRIBUTIONS FOR ESTI- 
MATING THE NUMBER OF USABLE RANDOM 
NUMBERS FOR A PARTICULAR SAMPLE 

Population No. of T/N =0.8 T/N =0.6 
size (N) usable 
and sample random Actual dis- Binomial Poisson Actual dis- Binomial Poisson 
size (n) numbers tribution approx. approx. tribution approx. approx. 














-00032 .00000 .08501 .01024 .00000 .18189 
.04000 .00000 . .20284 -21280 .24159 .34897 
-38560 .42950 .40680 -76000 .74145 .57300 
-88480 .88122 .67161 1,00000 1.00000 .79830 
-00000 1.00000 .90081 1.00000 1.00000 .94936 
-00000 1.00000 1.00000 1.00000 1.00000 1.00000 


-00000 .03116 .01024 .00000 .11463 
-00000 .09680 -13630—zj. . 25164 
-18975  .24910 .51880 .53772 .46715 
61113 =.51415 -88480 -72140 
-92304 .82163 .99280 -92137 
-00000 1.00000 1.00000 1.00000 1.00000 


-00000 .01350 -01024 .00000 .08205 
00000 .05155 10879 . 19769 
-16181 -41136—yj. -40008 
.40141 .78468 x -66576 
- 74855 -97030 -89826 
.00000 1.00000 1. -00000 


.00668 .01024 .06389 
-03015 09511 —« . 16470 
-11129 -35323 .35513 
.32172 -71376 .62490 
-68552 -94528 -87968 
-00000 1.00000 1. -00000 


-00502 -01024 -05817 
-02423 09098. - 15375 
-09543 -33503 tj. -33940 
- 29335 -68869 x -60983 
-66013 -93446 -87247 
-00000 1.00000 1. -00000 


-01545 00010. -08636 
-05711 -01793—yj. -20517 
- 17367 -30409 ij. -40982 
-41831 1.00000 1. -67422 
- 76057 1.00000 1. -90193 
-00000 1.00000 1. -00000 


0 
1 
2 
3 
4 
5 
0 
1 
2 
3 
4 
5 
0 
1 
2 
3 
4 
5 
0 
1 
2 
3 
4 
5 
0 
1 
2 
3 
4 
5 
0 
1 
2 
3 
4 
5 
TABLE 116 (continued) 








Population No. of T/N=0.8 T/N =0.6 
size (N) usable 
and sample random Actual dis- Binomial Poisson Actual dis- Binomial Poisson 
size (n) numbers tribution approx. approx. tribution approx. approx. 








-00000 .00000 .02482 , -00000 .09041 
-00005 .00000 .05521 ‘ -00000 =.1624i 
.00265 .00000 .11232 : : - 26879 
.04071 .00000 .20772 : : -40848 
.23293 .23649 .34715 , 3 -56899 
-61154 .64703 .52182 ‘ . -72708 
.90819 .91429 .70416 . - Of -85682 
-99335 -85646 , , -94202 
-00000 1. -95185 ‘ : -98397 
-00000 1. -99168 : 5 -99774 
-00000 1. -00000 : : -00000 


— 
ovocmn our ON K © 


00000. -00387 
-00001__—«. -01129 
00051 —yj. .02994 
.00758 xj. -07162 
-15311 
- 28970 
-48048 
-69366 
-87231 
-97212 
-00000 


0 
1 
2 
3 
4 
5 
6 
7 
8 
9 
0 


1 


-00048 
-00183 
-00639 
.01996 
.05537 
- 13456 
- 28214 
-50218 
- 74822 
-93164 
-00000 
PRACTICAL APPLICATIONS 
7. Illustrative Examples 


The formulas obtained in the preceding sections are particularly useful in 
practical problems that involve questions of the following types: 

1. How many random numbers do we need to select initially so that we may 
expect to have S usable numbers (that is, S different numbers remaining after 
eliminating duplicates and numbers that do not correspond to the serial num- 
bers of the class of individuals of interest)? 

2. How many random numbers do we need to select initially to be reasonably 
sure of having at least S usable numbers? 

3. How many random numbers do we need to select initially to be reason-
ably sure of having no more than S usable numbers?

Let us consider each of these three questions in connection with the example 
discussed in Section 1. 

Question 1. Suppose a population of size 18,000, with serial numbers from 1 
to 18,000, includes a target class of 15,965 individuals in which we are inter- 
ested. Suppose we wish to select enough 5-digit random numbers from 00000 
to 19999 (as outlined in Section 1) so that we may expect to have 800 different 
numbers corresponding to serial numbers of the target class. How many random 
numbers shall we select? 

For this problem, put

$$
N = 20{,}000, \qquad T = 15{,}965, \qquad R = T/N = 0.79825.
\tag{7.1}
$$

We are to find the value of n such that

$$
S = Es = 800.
\tag{7.2}
$$

Employing equations (5.12) and (5.15), we obtain the approximation

$$
Es \approx Rn\left[1 - \frac{n-1}{2N}\right].
\tag{7.3}
$$

Solving for n yields

$$
n = N + \tfrac{1}{2} - \tfrac{1}{2}\sqrt{(2N+1)^{2} - 8N(Es)/R}.
\tag{7.4}
$$

Substituting equivalent values from (7.1) and (7.2) in (7.4), we get

$$
n = 20{,}000.5 - \tfrac{1}{2}\sqrt{(40{,}001)^{2} - 8(20{,}000)(800)/0.79825}
  = 1028.6 = 1029, \text{ approximately.}
\tag{7.5}
$$


The result indicates that if we initially select about 1029 random numbers 
between 1 and 20,000, we can expect to get about 800 different numbers that 
correspond to serial numbers belonging to the target class of size 15,965. 
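
The arithmetic of Question 1 can be reproduced in a few lines. The sketch below solves the quadratic behind (7.4) and, as a check, evaluates the exact expectation (5.12) at the rounded answer; the function names are illustrative only.

```python
from math import sqrt


def draws_for_expected_usable(S, T, N):
    """Solve Rn[1 - (n - 1)/(2N)] = S for n, as in (7.4)."""
    R = T / N
    return N + 0.5 - 0.5 * sqrt((2 * N + 1) ** 2 - 8 * N * S / R)


def expected_usable(n, T, N):
    """Exact Es from (5.12)."""
    return T * (1 - ((N - 1) / N) ** n)
```

With S = 800, T = 15,965, and N = 20,000 the first function returns about 1028.6, and the exact expectation at n = 1029 is within one unit of 800, the small excess coming from the terms of (5.15) that (7.3) omits.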

Question 2. Suppose a population of size 18,000, with serial numbers from 
1 to 18,000, includes a target class of 15,965 individuals in which we are inter- 
ested. Suppose we wish to select enough 5-digit random numbers from 1 to 
20,000 so that we may be 95 per cent sure of having at least 800 different num- 
bers corresponding to serial numbers of the target class. How many random 
numbers shall we select? 
Clearly, the number to be initially selected must be somewhat in excess of 
the 1029 obtained in (7.5) as the number required so that the expected number 
of different usable random numbers will be almost exactly 800. This is suf- 
ficiently large so that the distribution of s, the number of usable random num- 
bers, will very likely approximate the normal distribution closely enough for 
our immediate purpose. The question to be answered can therefore be restated 
about as follows. How many random numbers do we need to select so that 


$$
Es - k\sigma_{s} = S,
\tag{7.6}
$$

where S is the desired minimum number of usable random numbers (in this
case, 800), and k is a factor such that the probability of $s \ge S$ is about 95
per cent?

From a table of the normal distribution [4, table], we see that for 
k= 1.644854, the related probability is 0.1. This means there is a 10 per cent 
chance that a normal deviate will differ from its mean by more than 1.644854 
standard deviations; and that the probability is 1/2 of 10 per cent, or 5 per cent, 
that it will differ from its mean by more than 1.644854 standard deviations in 
a negative direction. 

Hence, if we put

$$
k = 1.644854,
\tag{7.7}
$$

and solve equation (7.6) for n, we can be about 95 per cent sure that the number
of usable random numbers will exceed the specified number S, provided this
value is sufficiently large so that the distribution of the actual number of usable
numbers is approximately normal.

Let $\tilde{n}$ denote the value of n that satisfies equation (7.6). This value can be
approximated by some iterative method.

As a first approximation, let us put

$$
\tilde{n}_{1} = \frac{1}{R}\left(S + k\sqrt{S}\right).
\tag{7.8}
$$

For the example considered here,

$$
R = 0.79825, \qquad S = 800,
\tag{7.9}
$$

and k is given by (7.7). Substituting these values in (7.8), we get

$$
\tilde{n}_{1} = \frac{1}{0.79825}\left(800 + 1.644854\sqrt{800}\right) = 1060.
\tag{7.10}
$$
To see how good this approximation is, we substitute this value of n in equa-
tions (5.12) and (5.14), obtaining with the aid of (5.15),

$$
Es_{1} = (0.79825)(1060)\left[1 - \frac{1060-1}{2(20{,}000)} + \frac{(1060-1)(1060-2)}{6(20{,}000)^{2}}\right] = 824.1,
\tag{7.11}
$$

$$
\sigma_{s_1}^{2} = (0.79825)(1060)\left[1 - 0.79825 + \frac{[4(0.79825)-3](1060-1)}{2(20{,}000)}
  - \frac{[3(0.79825)\{4(1060)-7\} - 7(1060-2)](1060-1)}{6(20{,}000)^{2}}\right] = 174,
\tag{7.12}
$$

$$
\sigma_{s_1} = \sqrt{174} = 13.2.
\tag{7.13}
$$


We now substitute values from (7.11) and (7.13) in (7.6), obtaining

$$
S_{1} = 824.1 - (1.644854)(13.2) = 802.4.
\tag{7.14}
$$

Hence, if we initially select 1060 random numbers, we can be about 95 per cent
sure of obtaining 802 usable random numbers, that is, 802 different random
numbers corresponding to serial numbers in the target class.

If we wish to improve on the first approximation to n, we can take

$$
\tilde{n}_{2} = \left(\frac{800}{802.4}\right)(1060) = 1057
\tag{7.15}
$$

as a second approximation to the value of n that satisfies (7.6). To test this
approximation, we compute

$$
Es_{2} = 821.86,
\tag{7.16}
$$

$$
\sigma_{s_2}^{2} = 173.5,
\tag{7.17}
$$

$$
\sigma_{s_2} = \sqrt{173.5} = 13.17,
\tag{7.18}
$$

$$
S_{2} = 821.86 - (1.644854)(13.17) = 800.20.
\tag{7.19}
$$


This result indicates that 1057 is the integral value of n that comes closest to 
satisfying equation (7.6). 
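
The iteration used for Question 2 can be sketched as follows. The code starts from the first approximation (7.8) and then rescales n in proportion to the shortfall, as in (7.15). It uses the exact moment formulas (5.12) and (5.14) rather than the truncated series, so intermediate values may differ slightly in the last digit from those printed above; the function names are illustrative.

```python
from math import sqrt

K95 = 1.644854  # one-sided 5 per cent normal deviate, as in (7.7)


def usable_mean_sd(n, T, N):
    """Exact Es and sigma_s from (5.12) and (5.14)."""
    q1 = ((N - 1) / N) ** n
    q2 = ((N - 2) / N) ** n
    mean = T * (1 - q1)
    var = T * q1 + T * (T - 1) * q2 - (T * q1) ** 2
    return mean, sqrt(var)


def lower_limit(n, T, N, k=K95):
    """Left side of (7.6): Es - k * sigma_s."""
    mean, sd = usable_mean_sd(n, T, N)
    return mean - k * sd


def draws_for_at_least(S, T, N, k=K95, steps=2):
    """Successive approximations to the n satisfying (7.6)."""
    n = round((S + k * sqrt(S)) * N / T)   # first approximation (7.8)
    history = [n]
    for _ in range(steps - 1):
        n = round(n * S / lower_limit(n, T, N, k))   # rescaling as in (7.15)
        history.append(n)
    return history
```

For S = 800, T = 15,965, and N = 20,000, two steps give 1060 and then 1057, the two approximations obtained in the text.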

Because of the various approximations involved, including the use of the
normal distribution as an approximation to the distribution of the number of
usable numbers, there may be some question as to whether we are justified,
in this particular example, in assuming that $\tilde{n}_2$ is better than $\tilde{n}_1$ as the sample
size required to be 95 per cent sure of having 800 usable random numbers.

Question 3. Suppose a population of size 18,000, with serial numbers from 1 
to 18,000, includes a target class of 15,965 individuals in which we are interested. 
Suppose we wish to select as many 5-digit random numbers from 1 to 20,000 
as we can and still be 95 per cent sure of having no more than 800 usable num- 
bers in the target class. How many random numbers shall we select? 

Since the sample is fairly large, we can proceed in a manner similar to that 
discussed in connection with Question 2. Referring to a table of the normal 
distribution, we choose 


k = 1.644854, (7.20) 


as in (7.7), since the probability is 5 per cent that a normal deviate will exceed 
its mean by more than 1.644854 standard deviations. We then find the value of 
n that satisfies the equation 
$$
Es + k\sigma_{s} = S,
\tag{7.21}
$$

where S is the specified maximum number of usable random numbers, in this
particular problem, 800.

As a first approximation to this value of n, we put

$$
\tilde{n}_{1} = \frac{1}{R}\left(S - k\sqrt{S}\right).
\tag{7.22}
$$

Substituting 0.79825 for R and 800 for S, we get

$$
\tilde{n}_{1} = 944.
\tag{7.23}
$$

To see how good this approximation is, we substitute this value for n in
(5.12) and (5.14), obtaining

$$
Es_{1} = 736.1,
\tag{7.24}
$$

$$
\sigma_{s_1} = 12.44.
\tag{7.25}
$$

Substituting these values in (7.21) we get

$$
S_{1} = 736.1 + (1.644854)(12.44) = 756.6.
\tag{7.26}
$$


Hence, if we initially select 944 random numbers, we can be about 95 per cent 
sure of having no more than 757 usable random numbers—that is, 757 different 
random numbers that correspond to serial numbers in the target class. 

In this case, $\tilde{n}_1$ is not a very good approximation to the solution of equation
(7.21). To get a better approximation, we take

$$
\tilde{n}_{2} = 944\left(\frac{800}{756.6}\right) = 998
\tag{7.27}
$$

as a second approximation. Substituting this value for n in (5.12) and (5.14)
we get

$$
Es_{2} = 777.1,
\tag{7.28}
$$

$$
\sigma_{s_2} = 12.795.
\tag{7.29}
$$

Substituting these values in (7.21) yields

$$
S_{2} = 798.1.
\tag{7.30}
$$

To obtain a third approximation, we compute

$$
\tilde{n}_{3} = 998\left(\frac{800}{798.1}\right) = 1000.5,
\tag{7.31}
$$


which indicates that we should select 1000 random numbers initially if we wish 
to have as many as we can and still be about 95 per cent sure of having no more 
than 800 different random numbers that belong to the target class. 
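
The procedure for Question 3 differs from that for Question 2 only in the sign attached to k. A minimal sketch, again using the exact moments (5.12) and (5.14) and hypothetical function names, is:

```python
from math import sqrt

K95 = 1.644854  # one-sided 5 per cent normal deviate, as in (7.20)


def usable_mean_sd(n, T, N):
    """Exact Es and sigma_s from (5.12) and (5.14)."""
    q1 = ((N - 1) / N) ** n
    q2 = ((N - 2) / N) ** n
    mean = T * (1 - q1)
    var = T * q1 + T * (T - 1) * q2 - (T * q1) ** 2
    return mean, sqrt(var)


def upper_limit(n, T, N, k=K95):
    """Left side of (7.21): Es + k * sigma_s."""
    mean, sd = usable_mean_sd(n, T, N)
    return mean + k * sd


def draws_for_at_most(S, T, N, k=K95, steps=3):
    """Successive approximations to the n satisfying (7.21)."""
    n = round((S - k * sqrt(S)) * N / T)   # first approximation (7.22)
    history = [n]
    for _ in range(steps - 1):
        n = round(n * S / upper_limit(n, T, N, k))   # rescaling as in (7.27)
        history.append(n)
    return history
```

For S = 800, T = 15,965, and N = 20,000, three steps give 944, 998, and 1000, in line with (7.23), (7.27), and the conclusion drawn from (7.31).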
REFERENCES 


Our earlier discussion of measures of association for cross classifica- 
tions [66] is extended in two ways. First, a number of supplementary 
remarks to [66] are made, including the presentation of some new meas- 
ures. Second, historical and bibliographical material beyond that in [66] 
is critically surveyed; this includes discussion of early work in America 
by Doolittle and Peirce, early work in Europe by Kérésy, Benini, Lipps, 
Deuchler and Gini, more recent work based on Shannon-Wiener in- 
formation, association measures based on latent structure, and relevant 
material in the literatures of meteorology, ecology, sociology, and an- 
thropology. New expressions are given for some of the earlier measures 
of association. 


1. INTRODUCTION AND SUMMARY 


This paper has two purposes. First, we wish to present a supplementary dis-
cussion related to problems considered in our first paper on cross classifications
[66], including presentation of some new measures; this is Section 2 of the
present paper. Second, we wish to extend the brief historical and bibliographi- 
cal remarks in [66] to include a number of publications, many of them little- 
known, that may be of interest to those working with cross classifications; this 
is done in Sections 3 and 4 of the present paper. 

We have in preparation a paper on approximate distributions for the sample 
analogues of the measures of association described in [66], but it seems de- 
sirable to bring the present remarks, virtually none of which deal with sam- 
pling distributions, to the reader’s attention in a separate report. 

The literature on measures of association for cross classifications is vast, it is 
poorly integrated, and seldom in this literature are meaningful interpretations 
of measures adduced. One finds the same questions discussed in papers on 
meteorology, anthropology, ecology, sociology, etc. with hardly any cross 
references and with considerable duplication. In surveying this literature, we 
have been selective, although the length of this paper may not suggest it. 
Discussion of a measure of association here does not mean ipso facto that it 
has an operational interpretation, a very desirable characteristic for which we 
argued in [66], but may simply reflect some other interesting aspect of the 
measure, for example its historical role. 

One may organize the historical and bibliographical material in various ways, 
classifying by date, by type of measure, by substantive field, and so on. We 
have used a gross chronological division, but within it we have classified in 
several ways, as seemed most appropriate. Material from [66] has not been 
repeated here. 
2. SUPPLEMENTARY DISCUSSION TO PRIOR PAPER 


2.1. Cross classifications in which the diagonal is not of interest. Herbert Gold-
hamer (Rand Corporation) has been concerned with measuring association for
$\alpha \times \alpha$ tables where the classes are the same for the two polytomies, as in Section
8 of [66], but where the diagonal entries are of little or no interest. For exam-
ple, one might tabulate occupation of father against occupation of son, and
investigate the association between the two occupations only in the off-diagonal
subpopulation where they are not the same. Thus the situation, while similar
to that of reliability measures, as in Section 8 of [66], differs from it in that the
diagonal entries must not play a part; and hence $\lambda_r$ of [66] would not be suita-
ble.

It seems to us that reasonable measures of association in this situation would
be provided by $\lambda_a$, $\lambda_b$, or $\lambda$ in the unordered case, and by $\gamma$ in the ordered case,
when these measures are applied to the conditional classification with all
$\rho_{aa} = 0$. Hence, replacing $\rho_{ab}$, for $a \ne b$, by $\rho_{ab}/(1 - \sum_a \rho_{aa})$, and taking all $\rho_{aa} = 0$,
we would get a new table for which the $\lambda$'s or $\gamma$ would have direct conditional
interpretations. This kind of simple modification is often easy to make for
measures with operational interpretations, whereas it is not at all clear how
one might usefully alter a chi-square-like measure to fit Goldhamer's problem.
A similar point is made in another context in Section 4.13.
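
The modification just described is easy to carry out mechanically. The sketch below zeroes the diagonal, renormalizes the off-diagonal probabilities, and then computes $\lambda_b$. The formula used for $\lambda_b$, $(\sum_a \max_b \rho_{ab} - \max_b \rho_{\cdot b})/(1 - \max_b \rho_{\cdot b})$, is recalled from [66] and is not restated in this section, so it should be read as an assumption; the father-son table in the usage note is invented purely for illustration.

```python
def off_diagonal_table(table):
    """Conditional classification: zero diagonal, rest divided by 1 - sum of diagonal."""
    off_mass = 1.0 - sum(table[a][a] for a in range(len(table)))
    return [[0.0 if a == b else p / off_mass for b, p in enumerate(row)]
            for a, row in enumerate(table)]


def lambda_b(table):
    """Assumed form of lambda_b: relative decrease in error when predicting column."""
    cols = [sum(col) for col in zip(*table)]
    best_given_row = sum(max(row) for row in table)
    return (best_given_row - max(cols)) / (1.0 - max(cols))
```

For the hypothetical table `[[.30, .05, .05], [.04, .20, .06], [.02, .08, .20]]`, the conditional table has off-diagonal mass 1 and `lambda_b` of that table is 6/17.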

2.2. A relation between the $\lambda$ measures and Yule's Y. Suppose that in the
$2 \times 2$ case we make a transformation of form $\rho_{ab} \to s_a t_b \rho_{ab}$ so that all the marginals
become .5 [66, Sec. 5]. Then, for the altered table, $\lambda_a = \lambda_b = \lambda$, and all three are
equal to the absolute value of Y, where

$$
Y = \frac{\sqrt{\rho_{11}\rho_{22}} - \sqrt{\rho_{12}\rho_{21}}}{\sqrt{\rho_{11}\rho_{22}} + \sqrt{\rho_{12}\rho_{21}}}
\qquad (\rho\text{'s of original table})
$$

as described in Section 4 of [66]. The actual transformation is that for which

$$
(s_{1} : s_{2} : t_{1} : t_{2}) = \left(\sqrt{\rho_{21}\rho_{22}} : \sqrt{\rho_{11}\rho_{12}} : \sqrt{\rho_{12}\rho_{22}} : \sqrt{\rho_{11}\rho_{21}}\right).
$$

Thus we have another formal identity in the $2 \times 2$ case between a classical meas-
ure of association and one with an operational interpretation.
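
This identity can be verified numerically. The sketch below applies the transformation with the multipliers given above, rescales so that the table sums to 1, and compares $\lambda_b$ of the altered table with $|Y|$ of the original. The expression for $\lambda_b$ is recalled from [66] and is an assumption here, as are the function names and the sample table.

```python
from math import sqrt


def yule_Y(p):
    """Yule's Y for a 2x2 table of probabilities."""
    (p11, p12), (p21, p22) = p
    a, b = sqrt(p11 * p22), sqrt(p12 * p21)
    return (a - b) / (a + b)


def standardize(p):
    """Apply rho_ab -> s_a t_b rho_ab with the stated multipliers, then normalize."""
    (p11, p12), (p21, p22) = p
    s = (sqrt(p21 * p22), sqrt(p11 * p12))
    t = (sqrt(p12 * p22), sqrt(p11 * p21))
    raw = [[s[a] * t[b] * p[a][b] for b in range(2)] for a in range(2)]
    total = sum(map(sum, raw))
    return [[x / total for x in row] for row in raw]


def lambda_b(p):
    """Assumed form of lambda_b from [66]."""
    cols = [sum(col) for col in zip(*p)]
    return (sum(max(row) for row in p) - max(cols)) / (1.0 - max(cols))
```

For the illustrative table `[[.4, .1], [.2, .3]]`, the standardized table has all marginals equal to .5 and its `lambda_b` coincides with `abs(yule_Y(...))` of the original table.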

2.3. Symmetrical variant of proportional prediction. In Section 9 of [66], we
mentioned a measure of association based, not on optimal prediction, but on
proportional prediction in a manner there explained. If one predicts polytomy
B half the time and polytomy A the other half, always using proportional
prediction, then the relative decrease in the proportion of incorrect predictions,
as one goes from the nothing-given situation to the other-polytomy-category-
given situation, is

$$
\frac{\tfrac{1}{2} \sum_{a} \sum_{b} \left\{ (\rho_{ab} - \rho_{a\cdot}\rho_{\cdot b})^{2} (\rho_{a\cdot} + \rho_{\cdot b}) / (\rho_{a\cdot}\rho_{\cdot b}) \right\}}
     {1 - \tfrac{1}{2} \sum_{a} \rho_{a\cdot}^{2} - \tfrac{1}{2} \sum_{b} \rho_{\cdot b}^{2}}.
$$

In the $2 \times 2$ case this quantity, together with the asymmetrical $\tau_b$ of [66], re-
duces to

$$
\frac{(\rho_{11}\rho_{22} - \rho_{12}\rho_{21})^{2}}{\rho_{1\cdot}\rho_{2\cdot}\rho_{\cdot 1}\rho_{\cdot 2}},
$$

or $\phi^2$, the mean square contingency.
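
The reduction to the mean square contingency in the 2 by 2 case can be checked with a short sketch. The two functions below implement the symmetrical proportional-prediction measure as displayed above and $\phi^2$; the names and the sample table are illustrative only.

```python
def symmetric_proportional(p):
    """Symmetrical proportional-prediction measure for a table of probabilities."""
    rows = [sum(r) for r in p]
    cols = [sum(c) for c in zip(*p)]
    num = 0.5 * sum(
        (p[a][b] - rows[a] * cols[b]) ** 2 * (rows[a] + cols[b]) / (rows[a] * cols[b])
        for a in range(len(rows)) for b in range(len(cols))
    )
    den = 1.0 - 0.5 * sum(r * r for r in rows) - 0.5 * sum(c * c for c in cols)
    return num / den


def phi_squared(p):
    """Mean square contingency for a 2x2 table."""
    (p11, p12), (p21, p22) = p
    r1, r2 = p11 + p12, p21 + p22
    c1, c2 = p11 + p21, p12 + p22
    return (p11 * p22 - p12 * p21) ** 2 / (r1 * r2 * c1 * c2)
```

For the table `[[.4, .1], [.2, .3]]` both functions return 1/6, as the identity requires.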

2.4. Association with a particular set of categories. In Section 10 of [66], we
described a simple way to consider association between a particular A category
and the B polytomy; namely, coalescence of the $\alpha\times\beta$ table into a
$2\times\beta$ table whose rows correspond to the particular A category and its
negation respectively. A similar suggestion was made by Karl Pearson in 1906 [112].

We now discuss association between a particular set of A categories and the
B polytomy. Suppose that we want to consider the association between
$A_{a_1}, A_{a_2}, \ldots, A_{a_s}$, a specific set of A classes, and the B polytomy.
One possible approach is to condense all the $A_a$ rows that are not in the specific
set of A classes (i.e., all the $A_a$ rows where $a$ is not equal to any $a_k$,
$k = 1, 2, \ldots, s$) into a single row, thus obtaining an $(s+1)\times\beta$
table, and then apply whatever measure of association is thought appropriate.
This approach might be used if the entire original population is of interest, and
we are only concerned with association for the specific set of A categories and
their pooled remainder. If, however, the population of interest consists only of
those individuals who are in the specific set of A categories,
$A_{a_1}, A_{a_2}, \ldots, A_{a_s}$ $(s \ge 2)$, then we would apply whatever
measures of association are thought appropriate (e.g., $\lambda_a$, $\lambda_b$,
$\lambda$, $\gamma$, etc.) to the conditional classification with $p_{ab} = 0$ for
all $a$ that are not equal to any $a_k$, $k = 1, 2, \ldots, s$. That is, we would
delete all rows except those corresponding to $A_{a_1}, \ldots, A_{a_s}$, and in
those rows we would replace $p_{ab}$ by
$p_{ab} / \sum_{k=1}^{s} \sum_{b'=1}^{\beta} p_{a_k b'}$. We would then have an
$s\times\beta$ table, and the $\lambda$’s or $\gamma$ would have direct
conditional interpretations.
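
The two constructions just described (pooling the remaining rows, or conditioning on the chosen rows) can be sketched as follows. This is an illustrative sketch only; the helper names, the zero-based row indices, and the 4×3 example table are assumptions made here, not taken from the paper.

```python
def condense_rows(p, keep):
    """Keep the rows listed in `keep`; pool all other rows into one final row.

    Returns an (s + 1) x beta table whose entries still sum to one.
    """
    n_cols = len(p[0])
    kept = [list(p[a]) for a in keep]
    pooled = [
        sum(p[a][b] for a in range(len(p)) if a not in keep)
        for b in range(n_cols)
    ]
    return kept + [pooled]


def condition_on_rows(p, keep):
    """Delete all rows not in `keep` and renormalize: an s x beta table."""
    total = sum(sum(p[a]) for a in keep)
    return [[x / total for x in p[a]] for a in keep]


# Hypothetical 4 x 3 table of proportions (rows A_1..A_4, columns B_1..B_3).
table = [
    [0.10, 0.05, 0.05],
    [0.05, 0.20, 0.05],
    [0.05, 0.05, 0.15],
    [0.10, 0.10, 0.05],
]
keep = [0, 2]  # the specific set of A classes, as zero-based row indices

condensed = condense_rows(table, keep)        # 3 x 3: rows 0, 2, and "all others"
conditional = condition_on_rows(table, keep)  # 2 x 3: rows 0 and 2, rescaled
print(condensed)
print(conditional)
```

Any measure of association can then be applied to either resulting table, with the interpretation depending on which population is of interest.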

The association between a particular set of A categories and a particular 
set of B categories, or a particular set of combined (grouped) A categories and 
a particular set of combined B categories, can be treated in an analogous man- 
ner. 

2.5. Comparison of degrees of association exhibited by two cross classifications.
Sometimes one wishes to compare the degrees of association shown by two
cross-classified populations. This question is particularly likely to arise when
the two classifications are the same for both populations. It was discussed
briefly on page 740 of [66]; a bit more detail may be of interest here.

Suppose, for example, that we are considering two populations, each cross
classified by the same pair of polytomies and such that $\lambda_b$ is the
appropriate measure of association. That is, the relative decrease in probability
of error for optimum prediction of column, as we go from the case of row unknown
to that of row known, is the relevant population characteristic. Then the
difference between the $\lambda_b$’s of the two populations gives a simple
comparison with a clear meaning. Sometimes the relative difference between the
$\lambda_b$’s might be of more interest.

If the pairs of classifications for the two populations are not identical, as
will necessarily be the case when the two cross classification tables are of
different sizes, the purpose of comparison may not be clear. However, the
absolute or relative differences described above may still be used and have
perfectly definite interpretations. Of course, the above comments are applicable,
not only to $\lambda_b$, but to any other measure of association that has an
operational meaning.

When we are concerned with sampling problems, the question may arise
whether two sample values of $\lambda_b$ (say) from two different populations
differ with statistical significance. This question, together with other questions
relating to sampling, will be considered in a paper now in preparation. In that
paper we shall also discuss the question of whether K sample values of
$\lambda_b$ from K different populations (K > 2) differ with statistical
significance.

2.6. A new measure of association in the latent structure context. Several meas- 
ures of association discussed in Sections 3 and 4 are based upon probabilistic 
models of a latent structure nature. This kind of model is explained and dis- 
cussed in Section 4.9, and there we suggest a new measure in addition to those 
already suggested by others. 

2.7. Two corrections. The second and third sentences of the second paragraph
of [66], p. 758, are essentially correct, but may be misleading. It would have
been clearer to have written

It [$\lambda_r$] takes the value −1 if and only if (i) all $p_{ab}$’s not in the row or
column of the modal class are zero, and (ii) $p_{aa}$ for the modal class is not one. It
takes the value 1 if and only if (i) $\sum p_{aa} = 1$ (i.e., the two methods always
agree), and (ii) $p_{aa}$ for the modal class is not one.

Formula (6) on p. 740 of [66] should have contained a radical in the
denominator, so that the correct formula is

\[ T = \sqrt{[\chi^2/\nu] \Big/ \sqrt{(\alpha - 1)(\beta - 1)}}. \]


We thank Vernon Davies (Washington State) for calling this to our attention, 
and we apologize to him and to other readers for an erroneous corrigendum 
about this point on page 578 of the December 1957 issue of this Journal, 
in which a solidus was missing before the inner radical. 


3. WORK ON MEASURES OF ASSOCIATION IN THE LATE NINETEENTH AND 
EARLY TWENTIETH CENTURIES 


3.1. Doolittle, Peirce, and contemporary Americans; Köppen. In the 1880’s,
interest arose in American scientific circles regarding measures of association.
Such eminent men as M. H. Doolittle, of Doolittle’s method, and C. S. Peirce,
the well-known logician and philosopher, took part in the discussion.

Apparently it began with the publication [47] by J. P. Finley, Sergeant,
Signal Corps, U.S.A., of his results in attempting to predict tornadoes. During
four months of 1884, Finley predicted whether or not one or more tornadoes
would occur in each of eighteen areas of the United States. The predictions
generally covered certain eight-hour periods of the day. One of Finley’s
summary tables is given below as an example.

COMPARISON OF FINLEY TORNADO PREDICTIONS AND OCCURRENCES, APRIL, 1884.
SOURCE: [47, p. 86]

TABLE SHOWS FREQUENCIES OF TIME PERIOD-GEOGRAPHICAL AREA
COMBINATIONS IN EACH CELL

                            Occurrence
  Prediction        Tornado    No Tornado    Totals
  Tornado              11          14           25
  No Tornado            3         906          909
  Totals               14         920          934

Thus, for example, in 14 out of the 934 time period-geographical area
combinations considered, one or more tornadoes occurred; out of these 14, Finley
predicted 11. Since Finley’s predictions were correct in 917 out of 934 cases
he gave himself a percentage score of 100(917/934) = 98.18 per cent.* Thus he
used the diagonal sum mentioned in Section 8 of [66].

This score, as a measure of association between prediction and occurrence, 
is wholly inappropriate for Finley’s study. A completely ignorant person could 
always predict “No Tornado” and easily attain scores equal to or greater than 
Finley’s; in the above example, always predicting “No Tornado” would give 
rise to a score of 100(920/934) = 98.50 per cent. (Of course it is clear that Finley 
did appreciably better than chance; the question is that of measuring his skill 
by a single number.) 
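
The two percentage scores quoted above follow directly from the April table. A minimal sketch of the arithmetic (variable names are ours):

```python
# Finley's April 1884 table: rows are predictions, columns are occurrences.
finley = [
    [11, 14],   # predicted "Tornado":    occurred, did not occur
    [3, 906],   # predicted "No Tornado": occurred, did not occur
]
n = sum(sum(row) for row in finley)

# Finley's own score: proportion of cases on the main diagonal.
finley_score = 100 * (finley[0][0] + finley[1][1]) / n

# Score of someone who always predicts "No Tornado": every case in the
# "did not occur" column counts as correct.
never_tornado_score = 100 * (finley[0][1] + finley[1][1]) / n

print(round(finley_score, 2), round(never_tornado_score, 2))
```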

It was not long before Finley was taken to task. G. K. Gilbert [55] pointed 
out the fallacy and suggested another procedure, prefacing his suggestion, with 
commendable humility, in the following words: 

“It is easier to point out an error than to enunciate the truth; and in matters
involving the theory of probabilities the wisest are apt to go astray. The following
substitute for Mr. Finley’s analysis is therefore offered with great diffidence, and
subject to correction by competent mathematicians.” [55, p. 167]


If Finley’s table is written in terms of proportions rather than frequencies,
and in the notation of [66], it is of form

                            Occurrence
  Prediction        Tornado         No Tornado
  Tornado           $p_{11}$        $p_{12}$
  No Tornado        $p_{21}$        $p_{22}$
  Total             $p_{\cdot 1}$   $p_{\cdot 2}$



* Finley actually obtained such percentage scores for each geographical area
separately and then averaged the scores. For April the average was 98.51 per cent.

Gilbert suggests that a sensible index of prediction success would be the
quantity

\[ \frac{p_{11} - p_{1\cdot}p_{\cdot 1}}{p_{1\cdot} + p_{\cdot 1} - p_{11} - p_{1\cdot}p_{\cdot 1}}, \]

and he lists a number of formal properties that this index has. For example,
it is $\le 1$; it is zero when $p_{11} = p_{1\cdot}p_{\cdot 1}$; it has desirable
monotonicities; etc. Finally Gilbert mentions the difficulties of extending his
index to prediction problems with more than two alternatives. H. A. Hazen [77]
criticized Gilbert’s paper, and suggested an alternative index of predictive
success based upon a weighted scoring scheme that gave decreasing credit to
occurring tornadoes as they fell further from the center of the predicted region.
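
As an illustration, Gilbert’s index can be evaluated for Finley’s April table and for two limiting cases. This is a sketch under our own naming; the perfect-prediction and independence tables are hypothetical examples.

```python
def gilbert_index(p):
    """Gilbert's index for a 2x2 table of proportions.

    Rows are predictions and columns are occurrences, with "Tornado" first.
    """
    p11 = p[0][0]
    p1_dot = p[0][0] + p[0][1]   # proportion of "Tornado" predictions
    p_dot1 = p[0][0] + p[1][0]   # proportion of tornado occurrences
    chance = p1_dot * p_dot1     # value of p11 expected under independence
    return (p11 - chance) / (p1_dot + p_dot1 - p11 - chance)


counts = [[11, 14], [3, 906]]
n = sum(map(sum, counts))
finley = [[x / n for x in row] for row in counts]

g_finley = gilbert_index(finley)
g_perfect = gilbert_index([[0.2, 0.0], [0.0, 0.8]])          # hypothetical
g_independent = gilbert_index([[0.06, 0.14], [0.24, 0.56]])  # hypothetical
print(g_finley, g_perfect, g_independent)
```

The index is far below Finley’s 98.18 per cent score, which reflects the correction for the hits to be expected by chance.
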

In the same year that Gilbert’s paper appeared, C. S. Peirce [115] suggested
a much less ad hoc index of prediction success. Peirce pointed out that one
could think of the observed results as obtained by using an infallible predictor
a proportion $\theta$ of the time, and a completely ignorant predictor the remaining
proportion $1 - \theta$ of the time. The infallible predictor predicts “Tornado” if and
only if a tornado will occur. The ignorant predictor uses an extraneous chance
device that predicts “Tornado” with frequency $\psi$ and “No Tornado” with
frequency $1 - \psi$. Thus what we are asked to contemplate is a mixture of the two
2×2 sets of probabilities

\[ \begin{array}{c|c} p_{\cdot 1} & 0 \\ \hline 0 & p_{\cdot 2} \end{array}
\qquad\qquad
\begin{array}{c|c} p_{\cdot 1}\psi & p_{\cdot 2}\psi \\ \hline p_{\cdot 1}(1 - \psi) & p_{\cdot 2}(1 - \psi) \end{array} \]

with weights $\theta$ and $1 - \theta$ respectively. The meanings of the four cells in
these tables are the same as in the preceding tables.

The mixed table is, therefore,

\[ \begin{array}{c|c}
\theta p_{\cdot 1} + (1 - \theta)p_{\cdot 1}\psi & (1 - \theta)p_{\cdot 2}\psi \\ \hline
(1 - \theta)p_{\cdot 1}(1 - \psi) & \theta p_{\cdot 2} + (1 - \theta)p_{\cdot 2}(1 - \psi)
\end{array} \]

and Peirce inquires what values of $\theta$ and $\psi$ will reproduce the actually
observed 2×2 table. (Note that for any $\theta$ and $\psi$ the column marginals of the
mixed table are $p_{\cdot 1}$ and $p_{\cdot 2}$.)

For this approach to make sense, $\theta$ and $\psi$ must be uniquely defined in
terms of the actual $p_{ab}$ table. From the (1, 2) cell, we require

\[ (1 - \theta)\psi = p_{12}/p_{\cdot 2}, \]

whence from the (1, 1) cell

\[ \theta = \frac{p_{11}p_{22} - p_{12}p_{21}}{p_{\cdot 1}p_{\cdot 2}} = \frac{p_{11} - p_{1\cdot}p_{\cdot 1}}{p_{\cdot 1}p_{\cdot 2}}, \qquad
\psi = \frac{p_{12}p_{\cdot 1}}{p_{12}p_{\cdot 1} + p_{21}p_{\cdot 2}}. \]

Substitution shows that these values form a unique solution. The only difficulty
occurs when $\theta$ is negative, for then it can scarcely be a probability. $\theta$
itself is suggested as the measure of association in the sense of prediction success.

Note that

\[ \theta = \frac{p_{11}}{p_{\cdot 1}} - \frac{p_{12}}{p_{\cdot 2}}, \]

or the difference between the conditional columnwise probabilities of a tornado
prediction.

If $\theta = 1$, prediction is considered as good as possible, since it is equivalent
to infallible prediction. If $\theta = 0$, prediction is as poor as it can be without
being perverse, since it is equivalent to randomized prediction using the row
marginal frequencies of the table under investigation; that is, it corresponds to
independence. Further, the $\theta$ that makes
$\theta p_{\cdot 1} + (1 - \theta)p_{\cdot 1}\psi$ equal to $p_{11}$ has an operational
interpretation in terms of a hypothetical, if perhaps far-fetched, model
of activity. As $\theta$ increases, prediction improves.
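
Peirce’s solution can be checked by computing $\theta$ and $\psi$ from a table and rebuilding the table from the mixture. The sketch below uses Finley’s April data; the function names are ours, and it assumes that both column marginals are positive and that $p_{12}$ and $p_{21}$ are not both zero (otherwise $\psi$ is undefined).

```python
def peirce_theta_psi(p):
    """Peirce's theta and psi for a 2x2 table of proportions.

    Rows are predictions and columns are occurrences, with "Tornado" first.
    """
    p_dot1 = p[0][0] + p[1][0]
    p_dot2 = p[0][1] + p[1][1]
    theta = p[0][0] / p_dot1 - p[0][1] / p_dot2
    psi = p[0][1] * p_dot1 / (p[0][1] * p_dot1 + p[1][0] * p_dot2)
    return theta, psi


def mixed_table(theta, psi, p_dot1, p_dot2):
    """Mixture of the infallible and the ignorant predictor."""
    return [
        [theta * p_dot1 + (1 - theta) * p_dot1 * psi, (1 - theta) * p_dot2 * psi],
        [(1 - theta) * p_dot1 * (1 - psi),
         theta * p_dot2 + (1 - theta) * p_dot2 * (1 - psi)],
    ]


counts = [[11, 14], [3, 906]]
n = sum(map(sum, counts))
finley = [[x / n for x in row] for row in counts]

theta, psi = peirce_theta_psi(finley)
rebuilt = mixed_table(theta, psi, 14 / n, 920 / n)
print(theta, psi)
```

For these data $\theta$ is simply 11/14 − 14/920, the difference between the two columnwise proportions of tornado predictions.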

This proposal by Peirce is of a kind that may be called latent structure
measures. We discuss this kind of measure later on in Section 4.9. Peirce’s
measure, $\theta$, was independently proposed and differently motivated by W. J.
Youden in 1950 [66, p. 745, footnote].

Peirce mentions the extension of his approach to larger tables but gives no
details. He concludes by suggesting another index that takes into account the
“profit, or saving, from predicting a tornado, and . . . the loss from every
unfulfilled prediction of a tornado (outlay in preparing for it, etc.). . . .”

Peirce, writing in 1884, is the first person of whom we know to discuss the
measure of association problem with the intent of giving operationally
meaningful measures. Of course, further study might bring earlier proposals to light.

Very soon after Peirce’s letter appeared, M. H. Doolittle [35] discussed the
topic at the December 3, 1884, meeting of the Mathematical Section of the
Philosophical Society of Washington. Doolittle argued for a symmetrized
version of Peirce’s index, suggesting on rather ad hoc grounds the product of the
two possible asymmetrical Peirce quantities

\[ \frac{p_{11}p_{22} - p_{12}p_{21}}{p_{\cdot 1}p_{\cdot 2}} \cdot \frac{p_{11}p_{22} - p_{12}p_{21}}{p_{1\cdot}p_{2\cdot}}. \]

This product is simply the mean square contingency, and may be the first
occurrence of this chi-square-like index. Doolittle also alluded to the difficulty
of extending such measures beyond the 2×2 case.
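
That the product of the two asymmetrical Peirce quantities equals the mean square contingency can be confirmed numerically. A minimal sketch with our own function names, computing the mean square contingency independently as the sum of squared deviations from independence divided by the independence values:

```python
def doolittle_product(p):
    """Product of the two asymmetrical Peirce quantities for a 2x2 table."""
    r1, r2 = sum(p[0]), sum(p[1])
    c1, c2 = p[0][0] + p[1][0], p[0][1] + p[1][1]
    det = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    return (det / (c1 * c2)) * (det / (r1 * r2))


def mean_square_contingency(p):
    """Sum over cells of (p_ab - p_a. p_.b)^2 / (p_a. p_.b)."""
    rows = [sum(r) for r in p]
    cols = [sum(c) for c in zip(*p)]
    return sum(
        (p[a][b] - rows[a] * cols[b]) ** 2 / (rows[a] * cols[b])
        for a in range(len(rows))
        for b in range(len(cols))
    )


counts = [[11, 14], [3, 906]]  # Finley's April table
n = sum(map(sum, counts))
finley = [[x / n for x in row] for row in counts]

product_value = doolittle_product(finley)
phi2 = mean_square_contingency(finley)
print(product_value, phi2)
```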

At a subsequent meeting of the Mathematical Section (February 16, 1887), 
Doolittle [36] continued his discussion in more general terms than those of 
measures of prediction success alone. His discussion is similar at points to that 
of Yule’s 1900 paper [149] and he attempts to develop a rationale for the 
quantity we call the mean square contingency; Doolittle called it the discrim- 
inate association ratio. At a third meeting (May 25, 1887), Doolittle [36] con- 
cluded his discussion with a criticism of Gilbert’s criticism of Finley. 

We cannot forbear presenting a quotation from Doolittle in which he
struggles to state verbally the general approach he favors.

“The general problem may be stated as follows: Having given the number of
instances respectively in which things are both thus and so, in which they are thus but
not so, in which they are so but not thus, and in which they are neither thus nor so,
it is required to eliminate the general quantitative relativity inhering in the mere
thingness of the things, and to determine the special quantitative relativity
subsisting between the thusness and the soness of the things.” [36, p. 85]


What is a reasonable measure of prediction success for Finley’s tables in
terms of our $\lambda$ measures? In this case, $\lambda_b$ is zero, reflecting the fact
that knowledge of Finley’s prediction would be no better than ignorance of it in
predicting a tornado. If, however, we adjust Finley’s table so that the column
marginals are equal, while conditional column frequencies remain unchanged, we
obtain $\lambda_b^{*} = .67$. This means that if Finley’s prediction method were used
in a world in which tornadoes occur half the time, we could reduce the error of
prediction 67% by knowing Finley’s prediction as against not knowing it.
We might go further and make both column and row marginals equal, obtaining
$\lambda_b^{**} = .88$. The interpretation of this is the same as before except that now
Finley is allowed to use the knowledge that tornadoes occur half the time, so
that he will predict a tornado half the time.

It may, of course, be cogently argued that in situations such as Finley’s it 
is misleading to search for a single numerical measure of predictive success; 
and that rather the whole 2X2 table should be considered, or at least two num- 
bers from it, the proportions of false positives and false negatives. 

We conclude this section by mentioning briefly some suggestions made by
German meteorologists at about the same time. As early as 1870, W. Köppen
had considered association measures in connection with his study of the
tendency of meteorological phenomena to stay fixed over time. This is related to
the problem of measuring prediction, although it is not quite the same.
Köppen’s basic article on the topic appears to be [91]; an exposition is given by
H. Meyer [108, Chapters 11 and 13] together with further references. Köppen
and Meyer discuss the question of measuring constancy in various contexts;
one relates to a 2×2 table with both classifications the same but referring to
different times, and with the two marginal pairs of frequencies the same. For
example, the table might be of the following form:


                                             Wind at 2 P.M. at an
                                             observation station

                                             North                           Not North
  Wind at preceding 8 A.M.    North
  at the observation station  Not North      $p_{\bar N N} = p_{N \bar N}$
                                             $p_N$


In this case Köppen (as we interpret his discussion) suggests measuring
constancy of wind direction between 8 A.M. and 2 P.M., with respect to the
dichotomy North vs. Not North, by

\[ \frac{p_N(1 - p_N) - p_{N \bar N}}{p_N(1 - p_N)}, \]

or the difference between the probability of a change from North under
independence and the same actual probability, this difference taken relative to the
probability under independence.

In 1884, an article either by Köppen or someone probably influenced by him
[92] suggested a measure of reliability between meteorological prediction and
later occurrence in the 3×3 ordered case. The measure was


Count DE pm 


|a—b| =1 


as in Section 8.3 of [66]. The next year, H. J. Klein [89] discussed the simpler
measure $\sum p_{aa}$ in the general $\alpha\times\alpha$ reliability case.

Bleeker [10] presents an analytical survey of the above early American and 
German suggestions in the field of meteorological prediction, together with a 
discussion of many other papers. In Section 4.10 of this paper we survey more 
recent uses of association measures in meteorology. 

3.2. Körösy, Jordan, and Quetelet. In [85], Charles Jordan discusses
measures of association introduced by József Körösy in the late nineteenth century.
Körösy wrote extensively on the effectiveness of smallpox vaccination, and he
was led to introduce various measures of association for 2×2 tables in order to
summarize and interpret his large quantities of data. Among the several
measures discussed by Körösy for 2×2 tables, at least one is equivalent to Yule’s
Q and hence to our $\gamma$ (see [66]).

Jordan [85] extends one of Körösy’s measures to $\alpha\times\beta$ tables. In our
notation, the extended measure is found as follows. For a 2×2 table, Körösy had
proposed $(p_{11}p_{22})/(p_{12}p_{21})$ as a natural measure of association. Jordan
suggests forming all possible $\alpha\beta$ pooled 2×2 tables out of an
$\alpha\times\beta$ table, each of form

\[ \begin{array}{c|c}
p_{ab} & p_{a\cdot} - p_{ab} \\ \hline
p_{\cdot b} - p_{ab} & 1 - p_{a\cdot} - p_{\cdot b} + p_{ab}
\end{array} \]

and averaging the corresponding 2×2 measures to obtain an over-all measure.

Jordan states in [85] the maximum value for the mean square contingency
coefficient, $\phi^2$. (Jordan also gives this maximum value in another related paper,
[86]. The same maximum value has also been given by Cramér [66, p. 740].)
Jordan further discusses Körösy’s proof and use of the fact that, if in a 2×2
table we observe only a proportion of individuals in a column (i.e., if there is a
probability of selection), then, providing the selection probabilities in the two
cells of the column are equal, Yule’s Q and Körösy’s equivalent measure are
unaffected. This property of Q is emphasized by Yule in [149] and [150].
Finally, Jordan asserts priority for Körösy’s work in the following terms: “Le
mérite de Körösy consiste à avoir introduit et utilisé en 1887, c.-à-d. avant
l’avènement de la Statistique Mathématique, des grandeurs, mesurant
l’association, en bon accord avec les coefficients $\theta$ et Q de Yule et $\phi^2$ de
Pearson utilisés aujourd’hui.”

Körösy’s writings are not readily available, and we have consulted only one
of them [93]. This is a very interesting and sophisticated discussion of
statistical material on the efficacy of smallpox vaccination, in which Körösy uses
extensively 2×2 table coefficients of association. Emphasis is on the
interpretation of such material and on the many ways in which vaccination and smallpox
statistics might be consciously or unconsciously distorted, falsified, and biased.
(On page 221 of the same volume in which [93] appears, there begins a
fascinating discussion of a case of falsification of smallpox-vaccination data. The
culprit was an anti-vaccinationist, and the detective work was done by Körösy.)

The question of priority in the use of simple measures of association for 2×2
tables scarcely seems very important. However, it may be of historical interest
to note that Yule, in his first (1900) paper on the subject [149], speaks of
Quetelet’s use of a measure of association in 2×2 tables:
$(p_{11} - p_{1\cdot}p_{\cdot 1})/(p_{1\cdot}p_{\cdot 1})$, in our notation.¹ In fact,
Yule named his coefficient “Q” after Quetelet [150, p. 586]. The work by Quetelet
of which Yule writes is not accessible to us, but in another place [119] Quetelet
uses another very natural measure for comparing (say) the two rows of a 2×2 table
in a case wherein they correspond to two binomial populations. He simply takes the
ratio of the two binomial $p$’s: $(p_{11}/p_{1\cdot})/(p_{21}/p_{2\cdot})$. This ratio
probably has been used since nearly the beginning of arithmetic. Of course, neither
of the two measures last mentioned has the symmetry of chi-square or of Yule’s Q,
so that perhaps Jordan would say that they are not “en bon accord” with the
measures of Yule and Pearson.

Biographical, bibliographical, and appreciative material on Körösy may be
found in a book by Saile [121] and in an obituary by Thirring [134]. A more
recent paper by Jordan on the general question of association measures is [87].

3.3. Benini. In 1901, the Italian demographer and statistician, R. Benini 
[4, pp. 129 ff.] suggested measures of attraction and repulsion for 2X2 tables 
in which the categories of the two dichotomies were the same, or closely related. 
Benini was mainly concerned at this time with the association between di- 
chotomous characteristics of husband and wife among married couples, for exam- 
ple the association between categories of civil status. Among marriages in Italy 
during 1898, Benini gives the following 2X2 breakdown of premarital civil 
status (in relative frequencies) : 


                               Wife
  Husband         Unmarried     Widow      Totals
  Unmarried         .8668       .0275       .8943
  Widower           .0742       .0315       .1057
  Totals            .9410       .0590      1.0000

¹ Note that this is the same as the suggestion by Köppen mentioned in Section 3.1.

Comparing this with the corresponding “chance” table obtained by multiplying
marginal frequencies, Benini observed that there clearly was association
between the premarital civil statuses of husband and wife. To measure the
attraction between similar premarital civil statuses, he suggested the following
measure (our notation):

\[ \frac{p_{11} - p_{1\cdot}p_{\cdot 1}}{\operatorname{Min}(p_{1\cdot}, p_{\cdot 1}) - p_{1\cdot}p_{\cdot 1}} = \frac{p_{22} - p_{2\cdot}p_{\cdot 2}}{\operatorname{Min}(p_{2\cdot}, p_{\cdot 2}) - p_{2\cdot}p_{\cdot 2}}, \]

on the grounds that, when the numerator is nonnegative, the denominator
gives the maximum possible value of the numerator for fixed marginals. The
numerator is the usual quantity on which 2×2 measures of association are
based. When the numerator is negative, a slight revision of the formula
provides Benini’s measure of repulsion. In the above example Benini’s measure of
attraction has the value

\[ \frac{.8668 - .8415}{.8943 - .8415} = \frac{253}{528} = .479. \]

In 1928, Benini [5] extended his method of analysis by suggesting a
separation of the 2×2 population into two 2×2 subpopulations, one with two cells
empty, and the other with all marginal frequencies equal to 1/2. Then his 
measure of attraction (or repulsion) would be computed only for the second 
sub-population. This represents one way of eliminating the effect of unequal 
marginals in comparing several 2X2 populations. (In Section 5.4 of [66] an- 
other way of attaining this goal was briefly discussed.) A variation of this point 
of view, much akin to latent structure analysis (see Section 4.9), was applied by 
Benini to sex-ratios in twins in order to estimate the fractions of fraternal and 
identical twins in the population. 

Benini’s work has been discussed by a number of Italian statisticians. An 
early discussion was by Bresciani in 1909 [15]. A. Niceforo [110, pp. 383-91] 
and [111, pp. 462-8] also considers Benini’s suggestions, and provides an enter- 
taining discussion, with many examples, of several aspects of cross classifica- 
tions. We refer in particular to Chapter 16 of [111]. A lengthy critical analysis 
of Benini’s suggestions, as applied to matrimonial association, was given by 
R. Bachi [3]. Some further articles dealing with Benini’s work are those of G. 
de Meo [31], F. Savorgnan [126], G. Andreoli [2], and C. E. Bonferroni [13]. 
Benini’s first measure of attraction was independently suggested by Jordan 
[87] in 1941, by H. M. Johnson [84a] in 1945, and by L. C. Cole [28] in 1949. 
No doubt there have been many other independent suggestions of this
measure. It has been frequently used by psychologists and sociologists in recent
years and called, descriptively enough, $\phi/\phi_{\max}$.

Benini’s first measure has recently been critically reviewed by D. V. Glass,
J. R. Hall, and R. Mukherjee [63a, pp. 195-96, 248-59] in a book by these
writers and others on social mobility in Britain. Glass et al. deal mostly with
$\alpha\times\alpha$ cross classifications of father vs. son occupational status; their
general approach is to construct a number of 2×2 condensed cross classifications
from a larger $\alpha\times\alpha$ one, with the condensed dichotomies of form father
(son) in occupational status a vs. not in status a. Then the 2×2 condensations are
examined by looking at three of the ratios $p_{ab}/(p_{a\cdot}p_{\cdot b})$.

3.4. Lipps. In 1905, G. F. Lipps [100] discussed various ways of describing
dependence in a two-way cross classification. For the 2×2 case, Lipps
independently proposed Yule’s Q. For larger tables, Lipps points out that
$(\alpha - 1)(\beta - 1)$ numbers are required to describe the dependence in full; he
argues against use of a single numerical measure of association in these words:
“Es ist demzufolge nicht zulässig (ausser wenn r=s=2 [$\alpha = \beta = 2$ in our
notation]) einen einzigen Wert als schlechthin gültiges Mass der Abhängigkeit
aufzustellen” [100, p. 12]. However, he refers, in a footnote, to articles on
correlation by Galton and K. Pearson in contradistinction.

It is interesting to notice that, in the last section of his paper, Lipps proposed
a quantity equivalent to Kendall’s rank correlation coefficient, $\tau$. The quantity
Lipps proposed is Kendall’s $P = n(n-1)(\tau+1)/4$, where $n$ is sample size. Lipps
suggested testing for independence by this quantity, and to implement this
he computed its mean and variance under the hypothesis of independence. A
year later Lipps [101] discussed $2P - \binom{n}{2} = \binom{n}{2}\tau$. Material on
Lipps’ work, and on other early ranking methods, is presented by Wirth [147,
particularly Chapter 4, Section 28]. A discussion of the history of Kendall’s $\tau$
is given by Kruskal [96].
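
The relation between $P$ and $\tau$ is easy to verify when there are no ties, taking $P$ as the number of concordant pairs. The following is a minimal sketch with hypothetical data and our own function name.

```python
from itertools import combinations


def concordant_discordant(x, y):
    """Count concordant (P) and discordant (Q) pairs of observations."""
    p_count = q_count = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            p_count += 1
        elif sign < 0:
            q_count += 1
    return p_count, q_count


x = [1, 2, 3, 4, 5]  # hypothetical rankings without ties
y = [2, 1, 4, 3, 5]
n = len(x)
n_pairs = n * (n - 1) // 2

P, Q = concordant_discordant(x, y)
tau = (P - Q) / n_pairs

print(P, tau)
```

With no ties $P + Q = \binom{n}{2}$, which is what makes $P$, $2P - \binom{n}{2}$, and $\tau$ interchangeable as test statistics.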

3.5. Tönnies. The German sociologist, F. Tönnies, suggested in 1909 [137]
a measure of association for square cross classifications in which both
polytomies are ordered. A later paper is [138]. Tönnies presents his measure, which
is related to the so-called Spearman foot-rule, in terms of continuous, rather
than grouped, variates, but he immediately collects them into groups on the
basis of their relative magnitudes.

The measure, in our terminology, is found by first adding all $p_{aa}$’s, i.e. all
$p_{ab}$’s in the main diagonal, and multiplying this sum by 2. To this is added the
sum of all $p_{ab}$’s in the two diagonals neighboring the main diagonal. Then an
analogous weighted sum is computed for the counter-diagonal and its two
neighbors, and this is subtracted from the first sum. In terms of a formula,
Tönnies looks at

\[ \left[ 2\sum_{a-b=0} p_{ab} + \sum_{a-b=\pm 1} p_{ab} \right] - \left[ 2\sum_{a+b-1=\alpha} p_{ab} + \sum_{a+b-1=\alpha\pm 1} p_{ab} \right]. \]

He compares this quantity with $2 - (2/\alpha)$, its maximum possible absolute value.
Thus Tönnies’ measure is of the kind discussed briefly by us in Section 8.3 of
[66].
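
The weighting scheme can be sketched as follows. Dividing by $2 - (2/\alpha)$ to obtain a value between −1 and 1 is our reading of "compares this quantity with"; the function names and the example tables are hypothetical.

```python
def tonnies_raw(p):
    """Weighted main-diagonal sum minus weighted counter-diagonal sum.

    Cells on a diagonal get weight 2, cells next to it weight 1.
    Indices are zero-based, so the counter-diagonal is i + j == alpha - 1.
    """
    alpha = len(p)
    total = 0.0
    for i in range(alpha):
        for j in range(alpha):
            d_main = abs(i - j)
            d_counter = abs(i + j - (alpha - 1))
            w_main = 2 if d_main == 0 else 1 if d_main == 1 else 0
            w_counter = 2 if d_counter == 0 else 1 if d_counter == 1 else 0
            total += (w_main - w_counter) * p[i][j]
    return total


def tonnies_ratio(p):
    """Raw quantity relative to its maximum absolute value 2 - 2/alpha."""
    alpha = len(p)
    return tonnies_raw(p) / (2 - 2 / alpha)


alpha = 4
agree = [[0.25 if i == j else 0.0 for j in range(alpha)] for i in range(alpha)]
reverse = [[0.25 if i + j == alpha - 1 else 0.0 for j in range(alpha)]
           for i in range(alpha)]

print(tonnies_ratio(agree), tonnies_ratio(reverse))
```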

H. Striefler [130] provides an exposition of Tönnies’ measure and suggests
an extension.

3.6. Deuchler. In 1914, the German educational psychologist, Gustav
Deuchler [32], continued the earlier work of Lipps (Section 3.4) on the quantity
now called Kendall’s $\tau$. Deuchler worked on the distribution of $\tau$, both under
the null hypothesis of independence and under alternative hypotheses, on
methods of computing $\tau$, and on modifications when ties are present.²

² For further discussion of Deuchler’s work on $\tau$ itself we refer to [96]. For information about other aspects of
Deuchler’s work, and for remarks about an unpublished monograph by Deuchler, we refer to [95]. A microfilm of this
unpublished manuscript is in our hands, and we will try to make it available on request.

A few years later, Deuchler [33] returned to the question of multiple ties in
both coordinates, so that he was really concerned with cross classifications
having meaningful order for both polytomies. For this situation Deuchler
suggested as a measure of association (in our notation)

\[ \mathfrak{R} = \frac{\Pi_{s\le} - \Pi_{d\le}}{1 - \Pi_{t(\mathrm{both})}} = \frac{\Pi_s - \Pi_d}{1 - \Pi_{t(\mathrm{both})}}. \]

Here $\Pi_{s\le}$ ($\Pi_{d\le}$) is the probability that two randomly chosen individuals
from the cross classified population will have their A and B categories similarly
(dissimilarly) ordered, with a tie in one polytomy alone always counting as
similarity (dissimilarity), but a tie in both categories, i.e. both individuals in
the same cell, not counting in either case. $\Pi_{t(\mathrm{both})}$ is the probability that
two randomly chosen individuals fall into the same cell, i.e. are tied in both
polytomies. Thus $\mathfrak{R}$ is much like our $\gamma$ [66, Sec. 6] except that Deuchler
has $\Pi_{t(\mathrm{both})}$ where we have $\Pi_t$.

Actually Deuchler’s presentation is in terms of choosing two individuals at
random without replacement from a finite cross classified population, whereas
we in [66] give an interpretation in terms of random choice with replacement.
For $\mathfrak{R}$, one obtains the same value of the measure in either interpretation,
while $\gamma$ changes slightly as one goes from the with-replacement to the
without-replacement interpretation.

Deuchler develops his $\mathfrak{R}$ by the same scoring scheme as that later used by
Kendall. $\mathfrak{R}$ does not have quite as direct an interpretation as $\gamma$, but it
possesses one characteristic that $\gamma$ does not have: $\mathfrak{R}$ is 1 (its maximum
value) if and only if at most one $p_{ab}$ in each row and column is positive and the
positive $p_{ab}$’s are all concordant. This last means that, denoting the positive
$p_{ab}$’s by $p_{a_1 b_1}, p_{a_2 b_2}, \ldots$, with $a_1 < a_2 < \cdots$, then
$b_1 < b_2 < \cdots$. The examples on p. 750 of [66] show that this property is not
true for $\gamma$. Note that $|\mathfrak{R}| \le |\gamma|$.
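
The inequality between the two measures, and the fact that Deuchler’s measure reaches 1 on a purely diagonal table, can be illustrated with the with-replacement probabilities. This is a sketch only; the function names and the example tables are hypothetical, and $\gamma$ is taken as $(\Pi_s - \Pi_d)/(\Pi_s + \Pi_d)$.

```python
from itertools import product


def pair_probabilities(p):
    """Return (Pi_s, Pi_d, Pi_t(both)) for two draws with replacement."""
    cells = list(product(range(len(p)), range(len(p[0]))))
    same = diff = tie_both = 0.0
    for (a, b), (c, d) in product(cells, repeat=2):
        weight = p[a][b] * p[c][d]
        order = (a - c) * (b - d)
        if a == c and b == d:
            tie_both += weight   # both individuals in the same cell
        elif order > 0:
            same += weight       # similarly ordered in A and B
        elif order < 0:
            diff += weight       # dissimilarly ordered
        # ties in exactly one polytomy cancel in the numerator and are skipped
    return same, diff, tie_both


def gamma_and_deuchler(p):
    same, diff, tie_both = pair_probabilities(p)
    gamma = (same - diff) / (same + diff)
    deuchler = (same - diff) / (1 - tie_both)
    return gamma, deuchler


table = [[0.20, 0.10, 0.00],   # hypothetical 3 x 3 table of proportions
         [0.05, 0.20, 0.10],
         [0.05, 0.10, 0.20]]
diagonal = [[1 / 3, 0, 0], [0, 1 / 3, 0], [0, 0, 1 / 3]]

g, r = gamma_and_deuchler(table)
g_diag, r_diag = gamma_and_deuchler(diagonal)
print(g, r, g_diag, r_diag)
```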

Deuchler observes that $\mathfrak{R}$ varies as contiguous categories are pooled and he
discusses the magnitude of this effect at length. He also compares his $\mathfrak{R}$ with
Spearman’s rank correlation coefficient, and with the mean square contingency
coefficient in the 2×2 case. The applications that Deuchler has in mind, and for
which he uses his measure, are to the association between the grades of school
children in two subjects or traits. He discusses briefly the situation in which
one wishes to analyze such joint gradings on more than two such
characteristics. In [34], Deuchler discusses in more detail the 2×2 case.

3.7. Gini. In 1914-1916, Corrado Gini [56, 57, 58, 59, 60] examined in
detail many distinctions between relationships within a bivariate distribution,
and proposed a great variety of measures of association and disassociation.³
Examples were given to indicate the circumstances under which the various
proposed measures might be appropriate.

³ We wish to thank Sebastian Cassarino, Department of Italian, University of California, Berkeley, for his
assistance in examining Gini’s papers.

Many of Gini’s measures of association relate to cases in which the bivariate
distribution is quantitative or can easily be made so by the use of relevant
ordinal scores. For the qualitative case without ordering among the categories
(sconnesse categories), and where both polytomies are the same ($A_a = B_a$), Gini
[57] proposed as a measure of association the quantity (in our notation)

\[ \frac{\sum p_{aa} - \sum p_{a\cdot}p_{\cdot a}}{\sqrt{\left(1 - \sum p_{a\cdot}^2\right)\left(1 - \sum p_{\cdot a}^2\right)}}. \]

This is based on a sort of indirect scoring scheme, suggested by divergences of
cell frequencies from the corresponding marginal products. In the 2×2 case,
the above quantity is the appropriately signed square root of the mean square
contingency.

In [58], Gini proposed the following variant of the above measure:

\[ \frac{\sum p_{aa} - \sum p_{a\cdot}p_{\cdot a}}{1 - \tfrac{1}{2}\sum |p_{a\cdot} - p_{\cdot a}| - \sum p_{a\cdot}p_{\cdot a}}, \]

and a number of other variations were discussed systematically in [58] and
[60].
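
Both quantities are simple to compute for a square table. The sketch below checks the 2×2 reduction of the first measure to the signed square root of the mean square contingency, and shows that the variant equals 1 under complete agreement; the function names and example tables are hypothetical.

```python
import math


def _margins(p):
    rows = [sum(r) for r in p]
    cols = [sum(c) for c in zip(*p)]
    return rows, cols


def gini_measure(p):
    """Gini's measure for a square table with identical polytomies."""
    rows, cols = _margins(p)
    k = len(p)
    num = sum(p[a][a] for a in range(k)) - sum(rows[a] * cols[a] for a in range(k))
    den = math.sqrt((1 - sum(r * r for r in rows)) * (1 - sum(c * c for c in cols)))
    return num / den


def gini_variant(p):
    """Variant whose denominator is the largest numerator the margins allow."""
    rows, cols = _margins(p)
    k = len(p)
    chance = sum(rows[a] * cols[a] for a in range(k))
    num = sum(p[a][a] for a in range(k)) - chance
    den = 1 - 0.5 * sum(abs(rows[a] - cols[a]) for a in range(k)) - chance
    return num / den


table = [[0.30, 0.20], [0.10, 0.40]]  # hypothetical 2 x 2 proportions
rows, cols = _margins(table)
det = table[0][0] * table[1][1] - table[0][1] * table[1][0]
signed_phi = det / math.sqrt(rows[0] * rows[1] * cols[0] * cols[1])

agreement = [[0.2, 0, 0], [0, 0.3, 0], [0, 0, 0.5]]  # hypothetical 3 x 3 table
print(gini_measure(table), signed_phi, gini_variant(agreement))
```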

We have not found in Gini’s papers operational interpretations of his pro- 
posed measures. They all seem to be of a formal nature in which consideration 
of absolute or quadratic differences, followed by averaging, is taken as reason- 
able without argument. Special attention is paid to denominators so as to 
make the indices range between 0 and 1 (or −1 and 1) within appropriate 
limitations for variation in the joint distribution. 

In [57, p. 598], Gini briefly discussed polytomies in which the categories are 
cyclically ordered, as for example the months of the year. This type of polytomy 
was not discussed by us in [66]. Gini suggested the possibility of a measure of 
association in this case, but he gave little detail. Ten years later Pietra [116] 
considered the cyclical case in great detail, and since then other Italian authors 
have written on this topic. 

The measures proposed by Gini have formed the basis of a large literature, 
mostly in Italian. We now cite several publications outlining and discussing 
Gini’s work in this area. First, Gini himself [60, pp. 1458 ff.] gave a systematic 
outline of his measures. An exposition in English of some of the Gini material 
was given by Weida [144], and a more detailed exposition by Pietra in the 
introduction of [116]. A general article on the work of the Italian school is 
that of Gini [61]; another, of a critical nature, is by Thionet [132]. (The 
reader of this last article should also refer to subsequent correspondence by 
Galvani [54] and Thionet [133].) Two recent expositions by Gini are [62] and 
[63, Chap. 9]. 

Some further references to the recent Italian literature appear in Section 4.7. 
In Section 4.4 a measure proposed by Gini in the $\alpha \times 2$ case is discussed in 
detail. 











4. MORE RECENT PUBLICATIONS 


4.1. Textbook discussions. Guilford, Dornbusch and Schmid, Wallis and 
Roberts. In [73], J. P. Guilford discusses association in an $\alpha \times \beta$ table from the 
viewpoint of optimal prediction in a manner essentially equivalent to that of 
Guttman (see comment and reference in [66, p. 742]), and to that in which we 
introduced $\lambda_a$ and $\lambda_b$ in [66]. This discussion appears in Chapter 10 of the 1942 
edition and is amplified in Chapter 14 of the 1950 edition. 

In a recent textbook [37, p. 215], S. M. Dornbusch and C. F. Schmid discuss 
a “coefficient of relative predictability” for $\alpha \times 2$ tables, their G. It is equal 
to $\lambda_b$ for $\alpha \times 2$ tables. 

W. A. Wallis and H. V. Roberts present the $\lambda$ measures and $\gamma$ in their book 
[142, Chap. 9]. Their notation corresponds to ours as follows: 





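[Notation table not recoverable from the source: it set the Wallis-Roberts symbols (only “Ge-r” survives) against the corresponding Goodman-Kruskal symbols.] 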











and their discussion is in terms of sample frequencies. 

4.2. Reliability measures. We describe now some papers on measures of 
association in the reliability context, that is when both polytomies of a cross 
classification are the same and refer to two methods of assignment. Other papers 
that deal with reliability measures will be discussed elsewhere, particularly in 
Sections 4.9 through 4.12, under other classifications. 

Wood. In 1928, K. D. Wood [148] suggested several variations of the kind 
of measure of association described in Section 8.3 of [66] where reliability for 
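ordered polytomies was discussed. Wood’s suggestions related to a 4×4 table 
with $p_{a\cdot} = p_{\cdot b} = .25$ for all a and b; they were 

$$\sum_a p_{aa}, \qquad \sum_{|a-b| \le 1} p_{ab}, \qquad \sum_a p_{aa} - \sum_{a+b=5} p_{ab}, \qquad \text{and} \qquad \sum_{|a-b| \le 1} p_{ab} - \sum_{|a-b| \ge 2} p_{ab}.$$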


Actually, Wood’s discussion is in terms of sample analogs, and it is wholly 
motivated by the desire to find sample functions that approximate well to the 
sample correlation coefficient. To investigate this he divides a sample into 16 
parts via its marginal quartiles, computes the above measures, and compares 
them with the sample correlation coefficient. 

Reuning. H. Reuning [120] has recently suggested a new measure of reliability 
in the case of ordered polytomies. Reuning compares the actual $p_{ab}$ 
table with the table that would result if (a) there were independence between 
rows and columns, and (b) the marginal distributions were rectangular. He 
calls this the case of pure chance; its meaning is that each $p_{ab} = 1/\alpha^2$. Further, 
in order to use the natural ordering, Reuning suggests pooling all cells such that 
$|a-b|$ = constant. There are $\alpha$ cells such that $|a-b| = 0$, $2(\alpha-1)$ cells such 
that $|a-b| = 1$, $2(\alpha-2)$ cells such that $|a-b| = 2$, ..., and 2 cells such that 
$|a-b| = \alpha-1$, the maximum difference. Thus Reuning is led to compare 

$$\begin{array}{lll}
\sum_a p_{aa} & \text{with} & \alpha(1/\alpha^2) = 1/\alpha, \\
\sum_{|a-b|=1} p_{ab} & \text{with} & 2(\alpha-1)/\alpha^2, \\
\sum_{|a-b|=2} p_{ab} & \text{with} & 2(\alpha-2)/\alpha^2, \\
\quad\vdots & & \\
p_{1\alpha} + p_{\alpha 1} & \text{with} & 2/\alpha^2.
\end{array}$$







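In order to obtain a measure of reliability, Reuning in effect considers the 
following $\chi^2$-like quantity 

$$\sum_{k=0}^{\alpha-1} \frac{\left(\displaystyle\sum_{|a-b|=k} p_{ab} - \frac{\text{No. of summands in } \sum_{|a-b|=k}}{\alpha^2}\right)^{2}}{\dfrac{\text{No. of summands in } \sum_{|a-b|=k}}{\alpha^2}}.$$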

Reuning also considers $\sum_a p_{aa}$, a measure mentioned in [66]. 

The above presentation differs slightly from that given by Reuning, first, 
because he works with sample instead of population quantities, and, second, 
because he emphasizes testing rather than estimation. If we regard the population 
characteristic in the above display as a general measure of reliability (and 
it is not wholly clear from Reuning’s paper whether he so regards it) some 
problems of interpretation arise, stemming from the comparison with the “pure 
chance” cross classification. For one thing, if $\sum_a p_{aa} = 1$, so that reliability in the 
ordinary sense is perfect, Reuning’s measure takes the value $\alpha - 1$, which is by 
no means its maximum possible value. On the other hand, if $p_{1\alpha} + p_{\alpha 1} = 1$, so 
that reliability in the ordinary sense is about as poor as can be, Reuning’s 
measure takes the value $(\alpha^2 - 2)/2$, which is actually greater than its value 
for $\sum_a p_{aa} = 1$ (unless $\alpha = 2$, when the two values are equal). 
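
The two values just quoted can be reproduced with exact arithmetic. The following Python sketch is only an illustration under stated assumptions: it takes the population quantity to be the sum, over the pooled diagonals $|a-b| = k$, of the squared difference between the pooled probability and its “pure chance” value, divided by that chance value; the function names and the test tables are hypothetical.

```python
from fractions import Fraction

def reuning_quantity(p):
    """Chi-square-like comparison of the pooled diagonals of a square table
    with the 'pure chance' table in which every cell equals 1/alpha**2."""
    alpha = len(p)
    total = Fraction(0)
    for k in range(alpha):
        cells = [(a, b) for a in range(alpha) for b in range(alpha) if abs(a - b) == k]
        expected = Fraction(len(cells), alpha * alpha)   # (no. of summands) / alpha^2
        observed = sum(p[a][b] for a, b in cells)        # pooled probability for |a-b| = k
        total += (observed - expected) ** 2 / expected
    return total

def perfect_agreement(alpha):
    """All mass on the main diagonal (spread evenly here)."""
    return [[Fraction(1, alpha) if a == b else Fraction(0) for b in range(alpha)]
            for a in range(alpha)]

def extreme_disagreement(alpha):
    """All mass in the two corner cells (1, alpha) and (alpha, 1)."""
    p = [[Fraction(0)] * alpha for _ in range(alpha)]
    p[0][alpha - 1] = Fraction(1, 2)
    p[alpha - 1][0] = Fraction(1, 2)
    return p

for alpha in range(2, 6):
    print(alpha,
          reuning_quantity(perfect_agreement(alpha)),      # compare with alpha - 1
          reuning_quantity(extreme_disagreement(alpha)))   # compare with (alpha^2 - 2)/2
```

Exact fractions are used so that the comparison with $\alpha - 1$ and $(\alpha^2 - 2)/2$ does not depend on rounding.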

The “pure chance” or uniform table as a basis of comparison had been put 
forward by Andreoli [1] in 1934. H. F. Smith [126a] uses the same device of 
pooling along diagonals as does Reuning, but in the context of a comparative 
test of two square cross classifications. 

Cartwright. D. S. Cartwright, for the case of unordered polytomies, has recently 
[19] suggested a measure of interreliability when there are two or more 
classifications, each with the same polytomy. He thinks of the common polytomy 
as possible judgments about members of the population on the part of 
J judges, so that 

$$p_{a_1 a_2 \cdots a_J}$$

is that fraction of the population allocated by judge 1 to class $a_1$, by judge 2 to 
class $a_2$, etc., where $a_j = 1, 2, \ldots, \alpha$. His measure of reliability, in our notation, 
may be written as 

$$\frac{2}{J(J-1)} \sum_{i<j} \; \sum_{a_i = a_j} p_{a_1 a_2 \cdots a_J},$$

or the probability that two different randomly chosen judges out of the J 
judges will allocate a random member of the population to the same class. 
For J = 2, this becomes just $\sum_a p_{aa}$. 
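
The verbal definition translates directly into a computation. The Python sketch below is an illustration only: the representation of the joint distribution as a dictionary keyed by the tuple of the judges' classes, the function name, and the example distributions are all assumptions made here. It averages, over all pairs of judges, the probability that the two judges of the pair agree.

```python
from fractions import Fraction as F
from itertools import combinations

def cartwright_reliability(joint, n_judges):
    """Probability that two distinct, randomly chosen judges place a random
    member of the population in the same class.

    joint maps a tuple (a_1, ..., a_J) of class labels to its probability."""
    pairs = list(combinations(range(n_judges), 2))     # J(J-1)/2 pairs of judges
    agreement = sum(prob
                    for cell, prob in joint.items()
                    for i, j in pairs
                    if cell[i] == cell[j])
    return agreement / len(pairs)

# Hypothetical distribution for three judges.
three_judges = {(1, 1, 1): F(1, 2), (1, 1, 2): F(1, 4), (1, 2, 3): F(1, 4)}

# Hypothetical 2x2 table written as a two-judge distribution.
two_judges = {(1, 1): F(3, 10), (1, 2): F(2, 10), (2, 1): F(1, 10), (2, 2): F(4, 10)}

print(cartwright_reliability(three_judges, 3))
print(cartwright_reliability(two_judges, 2))   # reduces to the diagonal sum
```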

Cartwright’s presentation of his measure differs superficially from the above. 
He also considers distribution theory for the sample analogue of the above 
measure under special restrictive conditions. 

4.3. Measures that are zero if and only if there is independence. The traditional 
$\chi^2$-like measures of association, unlike the $\lambda$ and $\gamma$ measures discussed by us in 
[66], have the property that they take a particular value, zero, if and only if 
there is independence in the cross classification, i.e., $p_{ab} = p_{a\cdot}p_{\cdot b}$. This property 
has seemed important to a number of workers, and they have proposed measures 
of association with the property but different from the traditional measures. 
In some cases, other formal properties have also been emphasized. We 
now discuss several such proposals that do not fit more naturally into other 
sections of this survey. 

So far as we know, none of the measures discussed here have operational 
interpretations of the kind we have argued for in [66], and indeed this is not 
surprising. For a measure with an operational interpretation measures, so to 
speak, one aspect or dimension of association. Hence, if a given cross classifica- 
tion exhibits no association along this aspect or dimension one would expect 
a zero value for the measure, even if there is association in other senses. That 
is why we are not troubled by the fact that the $\lambda$ and $\gamma$ measures can be zero 
even though there is dependence. Note that if there is independence the $\lambda$ and $\gamma$ 
measures are zero. This is to be expected, since independence should correspond 
to lack of association in any sense. 

Cramér. In 1924, H. Cramér [29] suggested for an $\alpha \times \beta$ table the measure 

$$\min \sum_a \sum_b (p_{ab} - u_a v_b)^2,$$

where the minimum is computed over all numbers $u_1, \ldots, u_\alpha$; $v_1, \ldots, v_\beta$. 
This quantity is zero if and only if there is independence, and is always $\le .25$. 
It suffers from having no definite value in the case of complete dependence. 

Cramér says [29, p. 226] that “... there is no absolutely general measure 
of the degree of dependence. Every attempt to measure a conception like this 
by a single number must necessarily contain a certain amount of arbitrariness 
and suffer from certain inconveniences.” 

Steffensen. In 1933, J. F. Steffensen [127] proposed the following measure 
of association for cross classifications: 

$$\psi^2 = \sum_a \sum_b p_{ab}\, \frac{(p_{ab} - p_{a\cdot}p_{\cdot b})^2}{p_{a\cdot}(1 - p_{a\cdot})\, p_{\cdot b}(1 - p_{\cdot b})}$$

in our notation. (See Lorey [105] for a discussion.) Apparently Steffensen’s 
motivation was to avoid certain formal inadequacies of previously suggested 
measures. For example, Steffensen points out that his $\psi^2$ attains its upper limit 
of 1 if and only if the two classifications are functionally related, i.e. if and 
only if exactly one $p_{ab}$ in each row and in each column is positive. Steffensen 
gives no operational interpretation for $\psi^2$. Note that $\psi^2$ is an average of all 
2×2 mean square contingencies formed from each of the $\alpha\beta$ cells of the 
cross-classification and its complement; in this it resembles the measure proposed by 
Jordan [85] that we discussed in Section 3.2. 
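
The description of Steffensen's measure as an average of 2×2 mean square contingencies can be turned into a short computation. The Python sketch below is an illustration under stated assumptions: it weights the contingency of each cell-versus-complement 2×2 table by the cell probability, assumes that at least two rows and two columns carry positive probability (so that no denominator vanishes), and uses hypothetical function names and tables.

```python
from fractions import Fraction as F

def steffensen_psi2(p):
    """Cell-probability-weighted average of the mean square contingencies of
    the 2x2 tables formed by each cell and its complement."""
    n_rows, n_cols = len(p), len(p[0])
    row = [sum(p[a]) for a in range(n_rows)]
    col = [sum(p[a][b] for a in range(n_rows)) for b in range(n_cols)]
    total = F(0)
    for a in range(n_rows):
        for b in range(n_cols):
            if p[a][b] == 0:
                continue                      # zero weight: the cell contributes nothing
            deviation = p[a][b] - row[a] * col[b]
            contingency = deviation ** 2 / (row[a] * (1 - row[a]) * col[b] * (1 - col[b]))
            total += p[a][b] * contingency
    return total

# Hypothetical tables: functional relation, independence, and something in between.
functional = [[F(1, 2), 0, 0],
              [0, F(1, 3), 0],
              [0, 0, F(1, 6)]]
independent = [[F(1, 8), F(3, 8)],
               [F(1, 8), F(3, 8)]]
intermediate = [[F(3, 10), F(2, 10)],
                [F(1, 10), F(4, 10)]]

print(steffensen_psi2(functional), steffensen_psi2(independent), steffensen_psi2(intermediate))
```

For a 2×2 table every cell-versus-complement table is the table itself, so the weighted average reduces to the ordinary mean square contingency.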

The next year, Steffensen [128] returned to $\psi^2$ in greater detail. (In [127] 
the measure had appeared only in a nonnumbered page of errata, as a better 
version of a similar measure, given in the article proper, that Steffensen later 
decided was unsatisfactory.) Then Steffensen suggested a variant, 
$$\omega = \frac{2 \sum\sum^{+} (p_{ab} - p_{a\cdot}p_{\cdot b})}{\sum\sum^{+} (p_{ab} - p_{a\cdot}p_{\cdot b}) + 1 - \sum\sum p_{ab}^2},$$


where $\sum\sum^{+}$ means summation over those cells for which $p_{ab} > p_{a\cdot}p_{\cdot b}$. He showed 
that $\omega$, along with $\psi^2$, (1) lies between 0 and 1, (2) is 0 if and only if independence 
obtains, and (3) is 1 if and only if exactly one $p_{ab}$ in each row and column 
is positive. Finally, an extension to the case of continuous bivariate distributions 
was suggested. 

Immediately following [128] an editorial [114] (presumably by Karl Pearson) 
criticized Steffensen’s suggestions with arguments based on the assumption 
of an underlying continuous distribution. First, the editorial said that the 
continuous analogue of $\psi^2$ would be identically zero because of the presence of 
squared differentials. Then it argued that a measure of association for cross 
classifications should not be able to attain the value unity, because, while 
complete dependence might exist between the two polytomies, it could well be 
the case that a finer cross classification would show that within the original 
cells complete association did not exist. These arguments were used to contrast 
Steffensen’s suggestions with the coefficient of mean square contingency, to 
the latter’s favor. The editorial concluded with a numerical comparison of $\psi^2$ 
and the coefficient of mean square contingency for a number of artificial cross 
classifications, and it stated that $\psi^2$ tends to be too low, with values crowded 
in the interval [0, .25], even for quite sizable intuitive association. 

In 1941, Steffensen [129] returned to his discussion of $\omega$. He presented a 
natural generalization to the density function case and showed that the three 
properties mentioned above still essentially held. A lengthy discussion of the 
generalized $\omega$ in the bivariate normal case was given, and the paper concluded 
with a rebuttal to the arguments of [114]. 

This discussion reinforces our beliefs that it is essential to give operational 
interpretations of measures of association and that the mere fact that a meas- 
ure can range from 0 to 1 (say) is of little or no use in understanding it. 

Pollaczek-Geiringer. In 1932 and 1933, Hilda Pollaczek-Geiringer [117, 118], 
motivated by considerations similar to those adduced by Steffensen, suggested 
a measure of association for any bivariate distribution, continuous or discrete. 
The measure may also be applied, as Pollaczek-Geiringer suggested, to a 
cross-classification in which both polytomies are ordered. In our notation, the 
suggested measure for this case is 

$$\frac{\sum_a \sum_b (A_{ab}D_{ab} - B_{ab}C_{ab})}{\sum_a \sum_b (A_{ab}D_{ab} + B_{ab}C_{ab})},$$

where 

$$A_{ab} = \sum_{a' \le a} \sum_{b' \le b} p_{a'b'}, \qquad B_{ab} = \sum_{a' > a} \sum_{b' \le b} p_{a'b'},$$

$$C_{ab} = \sum_{a' \le a} \sum_{b' > b} p_{a'b'}, \qquad D_{ab} = \sum_{a' > a} \sum_{b' > b} p_{a'b'}.$$

Pollaczek-Geiringer gives no operational interpretation. Her measure has a 
certain similarity to our $\gamma$ [66, Section 6] especially if it is modified by replacement 
of the summations with weighted sums, having $p_{ab}$’s as weights. 
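
A small computation may help to show how this measure behaves. The Python sketch below is illustrative only and rests on an assumption about the garbled limits of summation: for each cell (a, b) the table is cut into four quadrants, with the cell's own row and column counted in the “at or below” part, so that the four quadrant sums add to one. The function name and the test tables are hypothetical.

```python
from fractions import Fraction as F

def pollaczek_geiringer(p):
    """Ratio of sum(A*D - B*C) to sum(A*D + B*C) over all cut points (a, b),
    where A, B, C, D are the four quadrant masses defined by the cut."""
    n_rows, n_cols = len(p), len(p[0])
    numerator = denominator = F(0)
    for a in range(n_rows):
        for b in range(n_cols):
            low_rows, high_rows = range(a + 1), range(a + 1, n_rows)
            low_cols, high_cols = range(b + 1), range(b + 1, n_cols)
            A = sum(p[i][j] for i in low_rows for j in low_cols)    # a' <= a, b' <= b
            B = sum(p[i][j] for i in high_rows for j in low_cols)   # a' >  a, b' <= b
            C = sum(p[i][j] for i in low_rows for j in high_cols)   # a' <= a, b' >  b
            D = sum(p[i][j] for i in high_rows for j in high_cols)  # a' >  a, b' >  b
            numerator += A * D - B * C
            denominator += A * D + B * C
    return numerator / denominator

# Hypothetical tables with ordered rows and columns.
increasing = [[F(1, 2), 0], [0, F(1, 2)]]
decreasing = [[0, F(1, 2)], [F(1, 2), 0]]
independent = [[F(1, 8), F(3, 8)], [F(1, 8), F(3, 8)]]

print(pollaczek_geiringer(increasing),
      pollaczek_geiringer(decreasing),
      pollaczek_geiringer(independent))
```

Under independence each term $A_{ab}D_{ab} - B_{ab}C_{ab}$ vanishes, which is why the third value is zero.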

Höffding. In 1941 and 1942, W. Höffding (now Hoeffding) presented two 
very interesting papers bearing on measures of association for cross classifications. 
Höffding’s first paper on cross classifications [79] was based on a prior 
paper of his [78] that had dealt solely with the bivariate density function case. 
In [78], it was urged that measures of association should be invariant under 
transformations, monotone in the same direction, of the associated random 
variables. Several measures having this invariance were presented and their 
properties discussed. The cross classifications of [79] were considered as arising 
from underlying density function distributions by rounding. Hence their cumulative 
distribution functions are only known at points of a rectangular lattice, 
and their density functions are only known via averages over cells. In order to 
apply the suggestions of [78], Höffding replaced a cross classification by a 
density function distribution with constant density within each cell, proportional 
to its $p_{ab}$. (This might appear to make matters depend on metrics for the 
two classifications, but any such dependence is a notational artifact, disappearing 
later because of invariance.) Then Höffding applied to this “step-function” 
density the measures of [78]. The first was the correlation coefficient between 
the probability integral transforms of the marginal random variables (this is 
the so-called grade correlation, or population analogue of Spearman’s rank 
correlation coefficient). Höffding obtained 

$$\rho = 3 \sum_a \sum_b p_{ab} \Bigl[ 2\Bigl(\sum_{a'<a} p_{a'\cdot}\Bigr) + p_{a\cdot} - 1 \Bigr] \Bigl[ 2\Bigl(\sum_{b'<b} p_{\cdot b'}\Bigr) + p_{\cdot b} - 1 \Bigr].$$

A slight modification gave him the more satisfactory 

$$\rho^* = \rho \Big/ \sqrt{\Bigl(1 - \sum_a p_{a\cdot}^3\Bigr)\Bigl(1 - \sum_b p_{\cdot b}^3\Bigr)}.$$


Höffding then discussed the extrema that $\rho$ and $\rho^*$ can reach, and their values 
for 2×2 tables. In the 2×2 case, $\rho^{*2}$ is just the mean square contingency. 
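
The 2×2 reduction can be verified numerically. The Python sketch below is an illustration under stated assumptions: it computes the grade correlation of the step-function density as three times the sum of $p_{ab}$ multiplied by a centered row grade and a centered column grade, where the centered grade of a class is twice the marginal mass of the earlier classes, plus its own mass, minus one; the modified coefficient divides by the square root of the product of one minus the sums of cubed marginals. Function names and the example table are hypothetical.

```python
from math import sqrt, isclose

def margins(p):
    rows = [sum(r) for r in p]
    cols = [sum(r[b] for r in p) for b in range(len(p[0]))]
    return rows, cols

def centered_grades(marginal):
    """For each class: 2 * (mass of earlier classes) + (own mass) - 1."""
    grades, cumulative = [], 0.0
    for mass in marginal:
        grades.append(2 * cumulative + mass - 1)
        cumulative += mass
    return grades

def hoeffding_rho(p):
    rows, cols = margins(p)
    u, v = centered_grades(rows), centered_grades(cols)
    return 3 * sum(p[a][b] * u[a] * v[b]
                   for a in range(len(rows)) for b in range(len(cols)))

def hoeffding_rho_star(p):
    rows, cols = margins(p)
    scale = sqrt((1 - sum(r ** 3 for r in rows)) * (1 - sum(c ** 3 for c in cols)))
    return hoeffding_rho(p) / scale

def phi_squared_2x2(p):
    """Mean square contingency of a 2x2 table."""
    (r1, r2), (c1, c2) = margins(p)
    return (p[0][0] - r1 * c1) ** 2 / (r1 * r2 * c1 * c2)

table = [[0.30, 0.20],
         [0.10, 0.40]]   # hypothetical 2x2 table
print(hoeffding_rho(table), hoeffding_rho_star(table) ** 2, phi_squared_2x2(table))
```

The last two printed values should coincide up to rounding.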

Höffding then pointed out that his $\rho^*$ is the same as Student’s modification 
of Spearman’s rank correlation coefficient [131], provided that appropriate 
notational translations are made. The article continued with a discussion of 
mean square contingency and related coefficients, including one that is a function 
of the quantities 

$$\Bigl(\sum_{a' \le a} \sum_{b' \le b} p_{a'b'}\Bigr) - \Bigl(\sum_{a' \le a} p_{a'\cdot}\Bigr)\Bigl(\sum_{b' \le b} p_{\cdot b'}\Bigr),$$

thus giving a measure of departure from independence as defined in terms of 
cumulative distributions. 

In the later portion of [80], Höffding returned to these questions. He distinguished 
between those cases in which a continuous distribution is considered as 
underlying the discrete distribution of interest, and those cases in which the 
discrete distribution itself is of primary interest. For this second situation he 
suggested a measure of association by analogy with one for density-function 
distributions suggested earlier in the article. It is simply 

$$\tfrac{1}{2} \sum_a \sum_b |p_{ab} - p_{a\cdot}p_{\cdot b}|.$$


A modification was then put forward, namely, division by 1— > Fo, where 


$\sum\sum^{+}$ means summation over those (a, b) such that $p_{ab} > p_{a\cdot}p_{\cdot b}$. The result is 
simply related to Steffensen’s $\omega$. 

Eyraud. H. Eyraud [43] suggested for the 2×2 table the measure of association 
$(p_{11} - p_{1\cdot}p_{\cdot 1})/(p_{1\cdot}p_{\cdot 1}p_{2\cdot}p_{\cdot 2})$. He discussed its extreme values, its interpretation, 
and, briefly, its extension to $\alpha \times \beta$ tables. In addition he considered the 
2×2×2 case. 

Fréchet, Féron. M. Fréchet has discussed measures of association in a series 
of articles (e.g. [50] and [51]) that deal mostly with cases in which a meaning- 
ful metric exists for both polytomies. In some more recent articles, [52] and 
[53], he has studied the extent to which knowledge of the marginals restricts 
the probabilities of a cross classification. Fréchet’s work discusses the extent 
to which measures of association satisfy a set of formal criteria such as those 
mentioned earlier in this section. 

In two recent publications, [45] and [46], R. Féron has discussed measures 
of association, again with emphasis on the case when metrics are present, but 
with some consideration of the purely qualitative case. Several of the measures 
described in this section are discussed by Féron. 

4.4. Measures of dissimilarity, especially in the $\alpha \times 2$ case. In considering an 
$\alpha \times 2$ cross classification, it is natural to approach the question of association 
by asking about the degree of dissimilarity between the two conditional multinomial 
populations in the two columns, when compared row by row. This 
approach has often been taken in the social sciences when columns refer to a 
dichotomy of interest (Negro-White, Male-Female, etc.) and rows correspond 
to places, times, or the like. It is, of course, equivalent to speak of a $2 \times \beta$ cross 
classification by simply interchanging rows and columns. 

Gini, Florence, Hoover, Duncan and Duncan, Bogue. A measure of dissimilarity 
in the $\alpha \times 2$ case that has been proposed a number of times, often in variant 
forms, is the following: 

$$D = \frac{1}{2} \sum_a \left| \frac{p_{a1}}{p_{\cdot 1}} - \frac{p_{a2}}{p_{\cdot 2}} \right|,$$

or half the sum of absolute differences between corresponding conditional 
probabilities in the first and second columns. The use of D appears to have 
been first suggested by C. Gini (see [56], [57], [61a]); some more recent 
publications about this measure are by P. S. Florence [48], E. M. Hoover [82] 
and [83], O. D. Duncan and B. Duncan [42], and D. J. Bogue [12]. 

Since the summation in D, if the absolute value signs were omitted, would 
be 1 − 1 = 0, we see that 

$$\sum_a^{+} \left( \frac{p_{a1}}{p_{\cdot 1}} - \frac{p_{a2}}{p_{\cdot 2}} \right) + \sum_a^{-} \left( \frac{p_{a1}}{p_{\cdot 1}} - \frac{p_{a2}}{p_{\cdot 2}} \right) = 0,$$

where $\sum_a^{+}$ indicates summation over nonnegative values of the summand, 
and $\sum_a^{-}$ indicates summation over negative values of the summand. Thus 

$$D = \sum_a^{+} \left( \frac{p_{a1}}{p_{\cdot 1}} - \frac{p_{a2}}{p_{\cdot 2}} \right) = -\sum_a^{-} \left( \frac{p_{a1}}{p_{\cdot 1}} - \frac{p_{a2}}{p_{\cdot 2}} \right),$$

and we see that D is the difference between the proportion of the population 
in column 1 appearing in rows for which $p_{a1}/p_{\cdot 1} > p_{a2}/p_{\cdot 2}$ and the proportion of 
the column 2 population appearing in these rows. A similar verbal statement, 
with the difference taken in the opposite sense, for rows with $p_{a1}/p_{\cdot 1} < p_{a2}/p_{\cdot 2}$, 
corresponds to the second equality of the above display. 

Now suppose we think of redistributing the (conditional) column 1 popula- 
tion among its cells so that it becomes equal to the (conditional) column 2 
population. This means moving probability mass from the column 1 cells with 
$p_{a1}/p_{\cdot 1} > p_{a2}/p_{\cdot 2}$ to those with the opposite inequality holding, and clearly the 
minimum proportion of the column 1 population that we must shift to achieve 
this goal is D. A similar interpretation may be given in terms of redistributing 
the column 2 population so that it becomes (conditionally) equal to the column 
1 population. After such a redistribution, the two cells in each row would have 
equal conditional probabilities, each conditional on its fixed column marginals. 
Also, the proportion of the population in a given row that is in column 1 will 
be the same for each row. Thus D has a useful operational interpretation for 
some purposes; for example see [42]. 
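
The redistribution interpretation lends itself to a direct check. The Python sketch below is illustrative only; the function names and the example table are assumptions introduced here. It computes D as half the sum of absolute differences between the two column-conditional distributions, and separately as the conditional mass of column 1 that exceeds the corresponding conditional mass of column 2, which is the mass that has to be shifted.

```python
from fractions import Fraction as F

def column_totals(p):
    return sum(row[0] for row in p), sum(row[1] for row in p)

def dissimilarity(p):
    """D: half the sum over rows of |p_a1/p_.1 - p_a2/p_.2| for an alpha x 2 table."""
    c1, c2 = column_totals(p)
    return sum(abs(row[0] / c1 - row[1] / c2) for row in p) / 2

def minimal_shift(p):
    """Excess of the column 1 conditional mass over the column 2 conditional
    mass, summed over the rows in which column 1 has the larger share."""
    c1, c2 = column_totals(p)
    return sum(max(row[0] / c1 - row[1] / c2, 0) for row in p)

# Hypothetical 3 x 2 table of joint probabilities.
table = [[F(3, 10), F(1, 10)],
         [F(1, 10), F(2, 10)],
         [F(1, 10), F(2, 10)]]

print(dissimilarity(table), minimal_shift(table))
```

The two printed values agree, in line with the argument that the positive and negative parts of the signed differences cancel.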

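The construction of D suggests an ordering of the rows that may be of substantive 
interest in some contexts. Rearrange the rows so that the row with 
maximum $(p_{a1}/p_{\cdot 1}) - (p_{a2}/p_{\cdot 2})$ becomes the first row, the row with next largest 
$(p_{a1}/p_{\cdot 1}) - (p_{a2}/p_{\cdot 2})$ becomes the second row, and so on. If there are $\alpha^*$ rows with 
$p_{a1}/p_{\cdot 1} > p_{a2}/p_{\cdot 2}$, D may then be expressed as 

$$D = \sum_{a=1}^{\alpha^*} \left( \frac{p_{a1}}{p_{\cdot 1}} - \frac{p_{a2}}{p_{\cdot 2}} \right) = -\sum_{a=\alpha^*+1}^{\alpha} \left( \frac{p_{a1}}{p_{\cdot 1}} - \frac{p_{a2}}{p_{\cdot 2}} \right)$$

in terms of the reordered cross classification. 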
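Some other easily obtained expressions for D are 

$$\begin{aligned}
D &= \sum_{a=1}^{\alpha} \left| \frac{p_{a1}}{p_{\cdot 1}} - p_{a\cdot} \right| \Big/ (2p_{\cdot 2}) = \sum_{a=1}^{\alpha} \left| \frac{p_{a2}}{p_{\cdot 2}} - p_{a\cdot} \right| \Big/ (2p_{\cdot 1}) \\
&= \sum_{a=1}^{\alpha} \sum_{b=1}^{2} \left| \frac{p_{ab}}{p_{\cdot b}} - p_{a\cdot} \right| \Big/ \bigl(4(1 - p_{\cdot b})\bigr) \\
&= \sum_{a=1}^{\alpha} \sum_{b=1}^{2} | p_{ab} - p_{a\cdot}p_{\cdot b} | \big/ [4p_{\cdot 1}p_{\cdot 2}].
\end{aligned}$$

The first three of these describe D in terms of absolute differences of form 
$(p_{ab}/p_{\cdot b}) - p_{a\cdot}$, while the last describes D in terms of the most conventional 
measure of deviations from cell independence, $p_{ab} - p_{a\cdot}p_{\cdot b}$. This last expression 
for D resembles the traditional $\chi^2$ kind of measure, but differs from such measures 
in that the absolute differences are used rather than the squared differences, 
and the weightings of the terms are different. 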

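Still another mode of description for D may be given in terms of absolute 
differences between the column conditional probabilities, $p_{ab}/p_{a\cdot}$, and the column 
marginals, $p_{\cdot b}$. It is easily checked that 

$$\begin{aligned}
D &= \sum_{a=1}^{\alpha} \left| \frac{p_{a1}}{p_{a\cdot}} - p_{\cdot 1} \right| p_{a\cdot} \big/ [2p_{\cdot 1}p_{\cdot 2}] \\
&= \sum_{a=1}^{\alpha} \left| \frac{p_{a2}}{p_{a\cdot}} - p_{\cdot 2} \right| p_{a\cdot} \big/ [2p_{\cdot 1}p_{\cdot 2}] \\
&= \sum_{a=1}^{\alpha} \sum_{b=1}^{2} \left| \frac{p_{ab}}{p_{a\cdot}} - p_{\cdot b} \right| p_{a\cdot} \big/ [4p_{\cdot 1}p_{\cdot 2}].
\end{aligned}$$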

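The traditional $\chi^2$-like measures may, of course, also be expressed in analogous 
equivalent ways in the special case of two columns. For example, $\phi^2 = \chi^2/\nu$ 
may be expressed as 

$$\begin{aligned}
\sum_a \sum_b (p_{ab} - p_{a\cdot}p_{\cdot b})^2 / (p_{a\cdot}p_{\cdot b}) &= \sum_a \sum_b \left( \frac{p_{ab}}{p_{\cdot b}} - p_{a\cdot} \right)^2 p_{\cdot b}/p_{a\cdot} \\
&= \sum_a \left( \frac{p_{a1}}{p_{\cdot 1}} - \frac{p_{a2}}{p_{\cdot 2}} \right)^2 p_{\cdot 1}p_{\cdot 2}/p_{a\cdot} \\
&= \sum_a \sum_b \left( \frac{p_{ab}}{p_{a\cdot}} - p_{\cdot b} \right)^2 p_{a\cdot}/p_{\cdot b} \\
&= \sum_a \left( \frac{p_{a1}}{p_{a\cdot}} - p_{\cdot 1} \right)^2 p_{a\cdot}/(p_{\cdot 1}p_{\cdot 2}) \\
&= 1 - \frac{1}{p_{\cdot 1}p_{\cdot 2}} \sum_{a=1}^{\alpha} \frac{p_{a1}p_{a2}}{p_{a\cdot}}.
\end{aligned}$$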
The possibilities of expressing a measure in terms of the deviation of $p_{ab}$ from 
$p_{a\cdot}p_{\cdot b}$, in terms of the deviation of $p_{a1}/p_{\cdot 1}$ from $p_{a2}/p_{\cdot 2}$, or in terms of the deviation 
of $p_{ab}/p_{a\cdot}$ from $p_{\cdot b}$, etc., may give added insight into the nature of the 
measure by suggesting interpretations and approaches to it from different 
directions. On the other hand, the same possibility of variant expression may 
cause confusion in communication and may mislead authors to think that symbolically 
different expressions correspond to different measures, when in fact 
the measures are the same. Duncan and Duncan [42] and J. Williams [145] 
discuss a number of articles where this difficulty seems to exist. The last form 
given above for $\phi^2$ has been discussed by E. Katz and P. Lazarsfeld [87a, 
p. 373]. 
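
The risk of mistaking equivalent expressions for different measures can be illustrated by computing three forms of the mean square contingency for the same two-column table. The Python sketch below is a hedged illustration: the three forms are the usual definition in terms of deviations of cell probabilities from marginal products, a form in terms of the contrast between the two column-conditional distributions, and the form one minus a normalized sum of $p_{a1}p_{a2}/p_{a\cdot}$. Function names and the example table are assumptions introduced here.

```python
from fractions import Fraction as F

def margins(p):
    rows = [r[0] + r[1] for r in p]
    cols = (sum(r[0] for r in p), sum(r[1] for r in p))
    return rows, cols

def phi2_definition(p):
    """Sum over cells of (p_ab - p_a. p_.b)^2 / (p_a. p_.b)."""
    rows, cols = margins(p)
    return sum((p[a][b] - rows[a] * cols[b]) ** 2 / (rows[a] * cols[b])
               for a in range(len(p)) for b in range(2))

def phi2_column_contrast(p):
    """Sum over rows of (p_a1/p_.1 - p_a2/p_.2)^2 * p_.1 p_.2 / p_a. ."""
    rows, (c1, c2) = margins(p)
    return sum((p[a][0] / c1 - p[a][1] / c2) ** 2 * c1 * c2 / rows[a]
               for a in range(len(p)))

def phi2_product_form(p):
    """1 - (1 / (p_.1 p_.2)) * sum over rows of p_a1 p_a2 / p_a. ."""
    rows, (c1, c2) = margins(p)
    return 1 - sum(p[a][0] * p[a][1] / rows[a] for a in range(len(p))) / (c1 * c2)

# Hypothetical 3 x 2 table of joint probabilities.
table = [[F(3, 10), F(1, 10)],
         [F(1, 10), F(2, 10)],
         [F(1, 10), F(2, 10)]]

print(phi2_definition(table), phi2_column_contrast(table), phi2_product_form(table))
```

All three functions return the same exact fraction for this table, although the formulas look quite different on the page.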

Measures of association for the $\alpha \times \beta$ case may be based on the idea of dissimilarity 
between two columns by averaging in some way the $\beta(\beta-1)/2$ possible 
values of an $\alpha \times 2$ measure of dissimilarity obtained from pairs of columns 
in the larger cross classification. Alternatively, one might average the $\beta$ values 
of an $\alpha \times 2$ measure obtained by comparing each column of the $\alpha \times \beta$ table with 
the column of row marginals, $p_{a\cdot}$. This approach has been used by Gini and 
by Fréchet, in references cited earlier. 

Boas. In 1922, Franz Boas [11, pp. 432-4] suggested a measure of dissimilar- 
ity between one specific column of a cross classification and the column of 
row marginals, that is between one multinomial population and the (weighted) 
average of a group of multinomial populations to which the one in question 
belongs. Boas’s suggestion, in our notation, seems to be the following: 
Suppose that an individual is chosen at random from the bth column of a 
cross classified population in accordance with the conditional distribution for 
that column. That is, the individual falls in the (a, b) cell with probability 
$p_{ab}/p_{\cdot b}$. Now suppose that we are told the row in which he falls but not told 
that he came from the bth column. If we guess his column, based on knowledge 
of his row, in a random manner reproducing the population (as discussed in 
Section 9 of [66]), we shall guess column b with conditional probability $p_{ab}/p_{a\cdot}$, 
where a is the row in which he has fallen. Thus the probability of correctly 
guessing the cell in which the individual falls, when (i) he is in fact drawn from 
the bth column, and (ii) we guess his column, knowing only his row, in a 
random manner reproducing the population, is 

$$\sum_a \left( \frac{p_{ab}}{p_{\cdot b}} \right) \left( \frac{p_{ab}}{p_{a\cdot}} \right) = \sum_a p_{ab}^2 / (p_{a\cdot}p_{\cdot b}).$$

That is, as we understand it, Boas’s measure of dissimilarity between column b 
and the column of row marginals. 
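
The guessing scheme can be written out as a short computation. The Python sketch below follows the reading of Boas's suggestion given above and is illustrative only; the function name and the example tables are assumptions. For a chosen column it sums, over rows, the probability of falling in the row given the column multiplied by the probability of guessing the column given the row.

```python
from fractions import Fraction as F

def boas_measure(p, b):
    """Probability of guessing column b correctly for an individual drawn from
    column b, when the guess reproduces the population within the observed row."""
    row_totals = [sum(row) for row in p]
    column_total = sum(row[b] for row in p)
    return sum((row[b] / column_total) * (row[b] / row_totals[a])
               for a, row in enumerate(p))

# Hypothetical tables of joint probabilities.
table = [[F(3, 10), F(1, 10)],
         [F(1, 10), F(2, 10)],
         [F(1, 10), F(2, 10)]]
independent = [[F(1, 8), F(3, 8)],
               [F(1, 8), F(3, 8)]]

print(boas_measure(table, 0))
print(boas_measure(independent, 0))   # equals the column marginal under independence
```

Under independence the guess ignores the row, so the value falls back to the column marginal itself.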

Boas also considers the possibility of changing the table so that it has equal 
column marginals (see Section 5.4 of [66]). 

Long and Loevinger. In working with psychological tests made up of yes-no 
questions, one may wish to consider association between a particular question 
and the whole test. This situation may be viewed in the framework of an $\alpha \times 2$ 
table in which the columns refer to the two possible responses and the rows 
make up an ordered classification based on the whole test. The $p_{ab}$’s are the 
proportions of individuals in the population falling into one of the whole-test 
score classes and responding to the individual question in one of the two pos- 
sible ways. For this special psychometric situation, measures of association 
have been proposed and discussed by Long [105] and by Loevinger [102, 
Chap. 5] and [103]. 

4.5. Measures based on Lorenz or cost-utility curves. For the $\alpha \times 2$ cross classification, 
where the $\alpha$ rows have a meaningful order (determined from the 
cross classification itself, as discussed in Section 4.4, or determined from external 
considerations) the following approach has been suggested. Consider 
the partial sums 

$$X_a = \sum_{i=1}^{a} \frac{p_{i1}}{p_{\cdot 1}} \qquad \text{and} \qquad Y_a = \sum_{i=1}^{a} \frac{p_{i2}}{p_{\cdot 2}},$$

and consider the points $(X_a, Y_a)$ for $a = 1, \ldots, \alpha$ in the unit square. The 
underlying thought is that these are points on a smooth curve expressing a 
functional relationship between X and Y, but that we only know this curve at 
the $\alpha$ points $(X_a, Y_a)$. If there is independence in the cross classification, then 
$Y_a = X_a$ for each a; i.e., the points $(X_a, Y_a)$ lie on the straight line segment going 
diagonally from (0, 0) to (1, 1). But if there is association, the general shape 
of the underlying curve suggested by the $(X_a, Y_a)$’s, and its “distance” from 
the diagonal line, will describe it. Several measures of association, based on this 
idea, have been suggested in the literature (see, e.g., [42], [65], [6], [41]), 
but we shall not discuss them here. In some cases, a structural assumption or 
smoothing procedure (e.g., the use of straight line segments) is used to obtain 
a curve from the $\alpha$ points. 
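
The points themselves are simple to compute. The Python sketch below is an illustration only: it accumulates the two column-conditional distributions row by row, in the given row order, and returns the resulting points; the function name and the example tables are assumptions. For rows ordered by decreasing difference of conditional probabilities, as in Section 4.4, the largest horizontal excess $X_a - Y_a$ equals the dissimilarity measure D, since D is the sum of the positive differences.

```python
from fractions import Fraction as F

def lorenz_points(p):
    """Cumulative column-conditional distributions (X_a, Y_a) of an alpha x 2 table."""
    c1 = sum(row[0] for row in p)
    c2 = sum(row[1] for row in p)
    points, x, y = [], F(0), F(0)
    for row in p:
        x += row[0] / c1
        y += row[1] / c2
        points.append((x, y))
    return points

# Hypothetical table, rows already ordered by decreasing p_a1/p_.1 - p_a2/p_.2.
table = [[F(3, 10), F(1, 10)],
         [F(1, 10), F(2, 10)],
         [F(1, 10), F(2, 10)]]
independent = [[F(1, 8), F(3, 8)],
               [F(1, 8), F(3, 8)]]

points = lorenz_points(table)
largest_gap = max(x - y for x, y in points)
print(points, largest_gap)
print(lorenz_points(independent))   # points on the diagonal
```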

4.6. Measures based on Shannon-Wiener information. McGill, Holloway, 
Woodbury, Wahl, Linfoot, Halphen. Some time ago it was suggested to us by 
J. W. Tukey that measures of association based on the Shannon-Wiener in- 
formation function might be useful. Since we were unable to satisfy ourselves 
that such measures would have reasonable interpretations for many contexts 
in which cross classifications appear, we did not discuss the possibility in [66]. 
We wish, however, to mention here a few papers in which the information con- 
cept is used as the basis of measures of association, although we continue to 
reserve our opinions about the utility of these proposals outside the area of 
communication theory. 

Perhaps the first such paper is by W. J. McGill [107]. Soon after it, the 
approach was suggested in a meteorological setting by J. L. Holloway, Jr. 
and M. A. Woodbury [81]. E. W. Wahl [141] summarizes some of the material 
of [81]. The measure has been used in meteorology, notably by I. I. Gringorten 
and his colleagues, [72] and [69]. Two quite recent papers on this general 
theme are by E. H. Linfoot [99] and E. Halphen [73a]. 

4.7. Recent proposals by Italian authors other than Gini. We have already 
discussed the early suggestions of Benini (Section 3.3) and the extensive pub- 
lications by Gini (Section 3.7). Since then, the Italian statistical literature has 
been replete with articles about one aspect or another of the measurement of 
association. Nearly all of this literature has been derivative from Gini’s 1914-16 
publications; the interested reader can find some key references in Section 3.7. 
We shall not attempt to give a complete outline of this literature, but some of 
the more interesting articles that have come to our attention will now be listed. 

Salvemini. A prolific writer on the theme of measures of association has been 
T. Salvemini. In [122], he surveyed parts of the field, and suggested some new 
expressions for Gini’s measures in the asymmetrical and unordered qualitative 
case. In [123], Salvemini discussed the calculation and application of measures 
of association; the case in which one polytomy is ordered, while the other is 
not, received consideration. More recently, he has presented [125] an extensive 
discussion of the whole field of measures of association. References to many 
other papers by Salvemini may be found in the three articles cited above. 

Bonferroni and Brambilla. C. E. Bonferroni has given [12a] a detailed discussion 
of a number of measures of association, emphasizing relations between 
the $p_{ab}$’s, $p_{a\cdot}$’s and $p_{\cdot b}$’s, and pointing out problems and concepts that arise in 
the three-way cross classification. Another article by Bonferroni in this area is 
[13]. Closely associated is the work of F. Brambilla [14] who presented a 
systematic discussion of the field giving particular emphasis to the effects of 
holding marginals fixed or not and to three-way cross classifications. 

Faleschini. Particularly interesting for us is an article by L. Faleschini [44]. 
His approach is to consider the most probable cell in the bth column, and to 
compare its conditional probability with some kind of average of the column 
conditional probabilities in the same row. Thus, if a*(b) is defined by 


$$p_{a^*(b)b} \ge p_{ab} \quad (\text{all } a),$$

Faleschini considers the differences 

$$D_b = \frac{p_{a^*(b)b}}{p_{\cdot b}} - \Bigl[ \text{some average of } \frac{p_{a^*(b)b'}}{p_{\cdot b'}} \ (b' = 1, \ldots, \beta) \Bigr].$$

Finally the $D_b$’s are averaged in some way. Thus two averages can be rather 
arbitrarily introduced. If in the first (that of the conditional probabilities) we 
weight by $p_{\cdot b'}$ ($b' \ne b$) and 0 ($b' = b$), and if in the second (that of the $D_b$’s) we 
weight by $p_{\cdot b}$, we obtain, following Faleschini, 

$$\sum_b \frac{p_{a^*(b)b} - p_{a^*(b)\cdot}\,p_{\cdot b}}{1 - p_{\cdot b}}.$$

Faleschini appears to feel that this kind of measure should only be used when 
$p_{a^*(b)b}/p_{\cdot b} \ge p_{a^*(b)b'}/p_{\cdot b'}$ for each b and b′, but we are not wholly clear about his 
intent. One difficulty with Faleschini’s suggestion is that of interpreting averages 
of conditional probabilities. Nonetheless, Faleschini’s discussion [44] is in 
terms of a probability model, the drawing of colored balls from urns. 
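
The passage from the two averages to the final expression can be checked numerically. The Python sketch below is an illustration under stated assumptions: for each column it finds the row of the most probable cell (taking the first such row in case of ties), forms the difference between that cell's column-conditional probability and the marginal-weighted average of the same row's conditional probabilities in the other columns, and then averages these differences with the column marginals as weights. The function names and the example table are hypothetical.

```python
from fractions import Fraction as F

def margins(p):
    rows = [sum(r) for r in p]
    cols = [sum(r[b] for r in p) for b in range(len(p[0]))]
    return rows, cols

def modal_row(p, b):
    """Row holding the largest cell of column b (first one on ties)."""
    return max(range(len(p)), key=lambda a: p[a][b])

def faleschini_two_stage(p):
    """Average the differences D_b exactly as described in two steps."""
    rows, cols = margins(p)
    total = F(0)
    for b in range(len(cols)):
        a = modal_row(p, b)
        others = [c for c in range(len(cols)) if c != b]
        # First average: conditional probabilities in row a, weighted by the
        # marginals of the other columns.
        average = (sum(cols[c] * (p[a][c] / cols[c]) for c in others)
                   / sum(cols[c] for c in others))
        d_b = p[a][b] / cols[b] - average
        total += cols[b] * d_b            # second average: weights are the column marginals
    return total

def faleschini_closed_form(p):
    """Sum over columns of (modal cell - row marginal * column marginal) / (1 - column marginal)."""
    rows, cols = margins(p)
    return sum((p[modal_row(p, b)][b] - rows[modal_row(p, b)] * cols[b]) / (1 - cols[b])
               for b in range(len(cols)))

# Hypothetical 3 x 2 table of joint probabilities.
table = [[F(3, 10), F(1, 10)],
         [F(1, 10), F(2, 10)],
         [F(1, 10), F(2, 10)]]

print(faleschini_two_stage(table), faleschini_closed_form(table))
```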

Andreoli. Finally, we wish to mention two articles by G. Andreoli, [1] and 
[2]. Among the topics discussed is that of association between characteristics 
of one individual and a group of individuals, for example between occupation 
of father and occupations of his several sons. 

4.8. Problems of inference discussed by Wilson, Berkson, and Mainland. We 
should like to call attention to three papers in the medical literature that are of 
interest in connection with measures of association, especially with respect to 
the very difficult problem of inference from one population to another. 

The first is by E. B. Wilson [146]. Wilson emphasizes the importance of 
specifying the population carefully. For example, consider the 2×2 table 

                                          | Dead with evidence | Not [dead with evidence 
                                          | of cancer          | of cancer] 
------------------------------------------+--------------------+------------------------ 
Dead with evidence of tuberculosis        |                    | 
Not [dead with evidence of tuberculosis]  |                    | 

If this table is filled in from the data of a large number of autopsies (so that all 
individuals represented in the table are dead) one may obtain a very different 
picture than if the table is filled in from the entire population, alive at a given 
time and observed one year later. 

The second paper is by Joseph Berkson [7]. It considers examples like the 
above with emphasis on differential selection as a cause of confusion. Berkson
proposes a specific mechanism for differential selection in the case of one study 
of the relation between smoking and lung cancer. 

The third paper is by Donald Mainland [106]. He gives in considerable de- 
tail an example showing how differential selection can lead to a grossly fallacious 
inference. 
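
The effect of differential selection can be illustrated with a small calculation. The sketch below is not the example given by Berkson or by Mainland; all prevalences and selection chances are hypothetical numbers chosen only for illustration. Two conditions are independent in the whole population, but individuals enter the studied group with chances that depend on which conditions they have, and the cross-product ratio in the selected group departs sharply from 1.

```python
def cross_product_ratio(t):
    """Cross-product (odds) ratio of a 2x2 table [[t11, t12], [t21, t22]]."""
    return (t[0][0] * t[1][1]) / (t[0][1] * t[1][0])


# Hypothetical population in which conditions A and B are independent.
p_a, p_b = 0.10, 0.10
population = [[p_a * p_b, p_a * (1 - p_b)],
              [(1 - p_a) * p_b, (1 - p_a) * (1 - p_b)]]

# Hypothetical chances of entering the studied group;
# rows = A present / absent, columns = B present / absent.
selection = [[0.65, 0.50],
             [0.30, 0.05]]

selected = [[population[i][j] * selection[i][j] for j in range(2)] for i in range(2)]
norm = sum(sum(row) for row in selected)
selected = [[cell / norm for cell in row] for row in selected]

print(cross_product_ratio(population))  # close to 1: no association
print(cross_product_ratio(selected))    # well below 1: association created by selection
```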

4.9. Measures based on latent structures. We have already discussed, for the 2×2
case, measures of association based on latent structures that have been
suggested by Peirce (Section 3.1) and Benini (Section 3.3). Both authors
suggested that the observable 2×2 cross classification might be regarded as an
average or mixture of two or more underlying cross classifications having spe- 
cial characteristics, e.g., independence. The underlying cross classifications are 
those of the latent classes. One may then take as a measure of association a 
numerical characteristic of the latent class probabilities together with the 
averaging or mixing weights, provided that this characteristic is expressible as 
a function of the four probabilities in the observable cross classification. The 
latent class structure, which may be considered as either real or fanciful, then 
provides an interpretation for the proposed measure of association. 

Lazarsfeld and Kendall. More recently, Paul F. Lazarsfeld has written ex- 
tensively about latent class structures; it was indeed Lazarsfeld who introduced 
the term “latent structure.” Although much of Lazarsfeld’s work on latent 
structures has been concerned with much broader problems, he and Patricia 
Kendall [88, Appendix A] have discussed measures of association based on 
latent classes in the 2×2 case. We describe first their “index of turnover.”

The sort of 2×2 cross classification that Lazarsfeld and Kendall discuss
might result from asking people the same yes-or-no question at two different
times. The supposed latent structure is that there are really two classes of
people in the population of interest, those whose latent attitude towards the
question is “Yes,” in proportion $K_1$, and those whose latent attitude is “No,”
in proportion $K_2 = 1 - K_1$. The actual answers that people give do not, however,
always express their latent attitudes, since they may be temporarily swayed in
the other direction, may misunderstand, and so on. Suppose that the “Yes”
people answer “No” with probability x, and that the “No” people answer “Yes”
with probability y. Responses are supposed independent for the people in a
given class. Further, in order that the latent structure make sense, we suppose
that x and y are < 1/2.

If, now, we choose at random a member of the population, the following 
four probabilities, arranged in 2×2 form, describe the distribution of his two
responses:

                                        Second answer
First answer   Yes                                  No                                   Totals

Yes            $p_{11} = K_1(1-x)^2 + K_2 y^2$          $p_{12} = K_1(1-x)x + K_2 y(1-y)$        $p_{1\cdot} = K_1(1-x) + K_2 y$
No             $p_{21} = K_1 x(1-x) + K_2(1-y)y$        $p_{22} = K_1 x^2 + K_2(1-y)^2$          $p_{2\cdot} = K_1 x + K_2(1-y)$

Totals         $p_{\cdot 1} = K_1(1-x) + K_2 y$         $p_{\cdot 2} = K_1 x + K_2(1-y)$         1



This is the observable 2×2 cross classification. Following our general approach,
we suppose it known and postpone discussion of sampling problems. Note that
$p_{12} = p_{21}$ and that $p_{i\cdot} = p_{\cdot i}$ (i = 1, 2). There are two independent probabilities
among the four of the 2×2 table, and three independent parameters of the
latent structure, so one cannot hope to express these parameters in terms of the
probabilities. If, however, one assumes that x = y, i.e., that the probability of a
deviant response is the same for both the “Yes” and “No” latent classes, then
the core of the above table simplifies to

$p_{11} = x^2 - 2K_1 x + K_1$        $p_{12} = x(1-x)$                            $p_{1\cdot} = K_1(1-2x) + x$
$p_{21} = x(1-x)$                $p_{22} = x^2 - 2x(1-K_1) + (1-K_1)$         $p_{2\cdot} = -K_1(1-2x) + 1 - x$



Hence $x = \tfrac{1}{2}[1 \pm \sqrt{1 - 4p_{12}}]$. Since we have assumed $x < \tfrac{1}{2}$, the minus sign should
be chosen. Thus $x = \tfrac{1}{2}[1 - \sqrt{1 - 4p_{12}}]$ measures an aspect of association that has
a real interpretation in the context of the stated latent structure, since x is
the probability of a deviant response. Also $2x(1-x)$ is the probability that a
random person answers the question differently at the two times; whence the
descriptive term “turnover.” And $1 - 2x(1-x)$ is the probability that a random
person answers the question similarly.

One can also easily express $K_1$ in terms of the p’s, since

$$K_1 = \frac{p_{1\cdot} - x}{1 - 2x} = \frac{1}{2} + \frac{2p_{1\cdot} - 1}{2\sqrt{1 - 4p_{12}}}.$$

Further, independence obtains if and only if either $K_1 = 0$ or 1, or $x = \tfrac{1}{2}$. Thus
x measures an aspect of association, unless $K_1 = 0, 1$.

A serious difficulty with the above latent structure is that it places severe
limitations on the p’s; only a limited set of 2×2 cross classifications can be fit
by it. In fact, it is necessary and sufficient that

$$(1)\ p_{12} < \tfrac{1}{4}, \qquad (2)\ p_{12} = p_{21}, \qquad \text{and} \qquad (3)\ p_{11} \ge p_{1\cdot}\, p_{\cdot 1}$$

for a 2×2 cross classification to be describable in terms of the above latent
structure.
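
The recovery of x and K_1 from an observed table can be sketched in a few lines of Python. This is only an illustration under the assumption x = y made above: it checks the conditions p12 < 1/4, p12 = p21, and p11 >= p1. p.1, then applies x = (1/2)[1 - sqrt(1 - 4 p12)] and K_1 = 1/2 + (2 p1. - 1) / (2 sqrt(1 - 4 p12)). The function name and the numerical tolerance are assumptions introduced here.

```python
import math


def turnover_fit(p11, p12, p21, p22, tol=1e-9):
    """Recover (x, K1) for the one-question latent structure with x = y.

    Returns None when the table does not satisfy the three conditions.
    """
    p1_dot = p11 + p12          # total for "Yes" on the first answer
    p_dot1 = p11 + p21          # total for "Yes" on the second answer
    fits = (p12 < 0.25
            and abs(p12 - p21) <= tol
            and p11 >= p1_dot * p_dot1 - tol)
    if not fits:
        return None
    root = math.sqrt(1.0 - 4.0 * p12)
    x = 0.5 * (1.0 - root)                      # minus sign, since x < 1/2
    k1 = 0.5 + (2.0 * p1_dot - 1.0) / (2.0 * root)
    return x, k1


# Hypothetical table generated from K1 = 0.7 and x = 0.1.
print(turnover_fit(0.57, 0.09, 0.09, 0.25))
```

For the hypothetical table shown, the turnover probability 2x(1-x) equals p12 + p21 = 0.18, as the structure requires.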

Kendall and Lazarsfeld also discuss a more general measure, appropriate to
some cases in which $p_{12} \ne p_{21}$, by enlarging the model to embrace three, rather
than two, latent classes with special characteristics. In order to exemplify the
possibilities, we should like to suggest a new measure that may be more
appropriate to some cases in which $p_{12} \ne p_{21}$. Which measures to use, if any,
depends of course on context. The measure we shall now describe might be
appropriate when two closely related questions are both asked once, rather than
when the same question is asked twice, and we describe it in these terms.

Suppose that on question 1 people give deviant answers (e.g. a “yes” person
answers “no”) with probability $x_1 < \tfrac{1}{2}$, and that on question 2 they give deviant
answers with probability $x_2 < \tfrac{1}{2}$. The probabilities of deviant response do not
depend on the class to which a person belongs. In all other respects the latent 
structure is the same as before. We then have three independent parameters, 
$K_1$, $x_1$, and $x_2$, for describing our structure, and the 2×2 table becomes

                                     Answer to question 2
Answer to
question 1     Yes                                            No                                             Totals

Yes            $p_{11} = K_1(1-x_1)(1-x_2) + K_2 x_1 x_2$         $p_{12} = K_1(1-x_1)x_2 + K_2 x_1(1-x_2)$          $p_{1\cdot} = K_1(1-x_1) + K_2 x_1$
No             $p_{21} = K_1 x_1(1-x_2) + K_2(1-x_1)x_2$          $p_{22} = K_1 x_1 x_2 + K_2(1-x_1)(1-x_2)$         $p_{2\cdot} = K_1 x_1 + K_2(1-x_1)$

Totals         $p_{\cdot 1} = K_1(1-x_2) + K_2 x_2$               $p_{\cdot 2} = K_1 x_2 + K_2(1-x_2)$               1

We may now express $K_1$, $x_1$, and $x_2$ in terms of the p’s; and $x_1$ and $x_2$, thus
expressed, are interpretable measures of association in terms of the supposed
latent structure. They are the probabilities of deviant responses to the two
questions. In order to get a single measure, one might take the average of $x_1$
and $x_2$; that is, the probability of deviant response to one of the two questions,
which one to be decided by the toss of a fair coin. Or one might use
$x_1 x_2 + (1-x_1)(1-x_2)$, the probability that a random person answers the two
questions similarly.


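It is easily seen from the above table that

$$x_1 = \frac{p_{1\cdot} - K_1}{1 - 2K_1}, \qquad x_2 = \frac{p_{\cdot 1} - K_1}{1 - 2K_1},$$

and that

$$\frac{p_{11} - p_{1\cdot}\, p_{\cdot 1}}{1 - 2(p_{12} + p_{21})} = K_1(1 - K_1) = R \quad (\text{say}).$$

Hence

$$K_1 = \tfrac{1}{2}[1 \pm \sqrt{1 - 4R}]$$

and we see that, for our latent structure to hold, R, as a function of the p’s,
must be $\le \tfrac{1}{4}$. Substituting in the above expressions for $x_1$ and $x_2$, we obtain

$$x_1 = \frac{1}{2} \pm \frac{2p_{1\cdot} - 1}{2\sqrt{1 - 4R}}, \qquad x_2 = \frac{1}{2} \pm \frac{2p_{\cdot 1} - 1}{2\sqrt{1 - 4R}}.$$

There remains the question about sign choice in the solution of the quadratic
for $K_1$. We want to be able to make the same choice for both $x_1$ and $x_2$ so that
$x_1$ and $x_2$ are $< \tfrac{1}{2}$. This means that $p_{1\cdot} - \tfrac{1}{2}$ and $p_{\cdot 1} - \tfrac{1}{2}$ must have the same sign
in the sense that $(p_{1\cdot} - \tfrac{1}{2})(p_{\cdot 1} - \tfrac{1}{2}) \ge 0$. The necessary conditions thus far
suggested come to (1) $p_{12} + p_{21} < \tfrac{1}{2}$, (2) $(p_{1\cdot} - \tfrac{1}{2})(p_{\cdot 1} - \tfrac{1}{2}) \ge 0$, and (3) $p_{11} \ge p_{1\cdot}\, p_{\cdot 1}$.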
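Note that if $p_{12} = p_{21}$, then $p_{1\cdot} = p_{\cdot 1}$, $x_1 = x_2$, and

$$1 - 4R = 1 - \frac{4(p_{11} - p_{1\cdot}^2)}{1 - 4p_{12}} = \frac{(1 - 2p_{1\cdot})^2}{1 - 4p_{12}},$$

so that $x_1 = x_2 = \tfrac{1}{2} \pm \tfrac{1}{2}\sqrt{1 - 4p_{12}}$,
and the minus sign must be chosen, to obtain the same result as in the earlier
structure. So the structure now being discussed does generalize the earlier one,
giving us two turnover indexes.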

For the present structure, independence obtains if and only if $K_1$ is 0 or 1,
or if either $x_1$ or $x_2 = \tfrac{1}{2}$. Thus we see again that $x_1$ and $x_2$ measure aspects of
association, unless $K_1 = 0, 1$.

Necessary and sufficient conditions for the present structure to be possible
may be expressed in various ways. One such set of conditions is the following
pair:

$$(1)\quad 0 \le \frac{p_{11} - p_{1\cdot}\, p_{\cdot 1}}{1 - 2(p_{12} + p_{21})} \le \operatorname{Min}[p_{1\cdot}\, p_{2\cdot},\ p_{\cdot 1}\, p_{\cdot 2}],$$

$$(2)\quad (p_{1\cdot} - \tfrac{1}{2})(p_{\cdot 1} - \tfrac{1}{2}) \ge 0.$$
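
The two-question structure can be fitted numerically along the same lines. The sketch below is an illustration only: it computes R = (p11 - p1. p.1) / (1 - 2(p12 + p21)), solves K_1(1 - K_1) = R, and picks the root of the quadratic that makes both deviant-response probabilities fall below 1/2. The function name, the tolerance, and the handling of the boundary case R = 1/4 (where these formulas do not determine x_1 and x_2) are assumptions made here.

```python
import math


def two_question_fit(p11, p12, p21, p22, tol=1e-12):
    """Recover (K1, x1, x2) for the two-question latent structure, or None."""
    p1_dot, p_dot1 = p11 + p12, p11 + p21
    denom = 1.0 - 2.0 * (p12 + p21)
    same_side = (p1_dot - 0.5) * (p_dot1 - 0.5) >= 0.0
    if denom <= 0.0 or not same_side:
        return None
    r = (p11 - p1_dot * p_dot1) / denom          # R = K1 (1 - K1)
    if r < -tol or r > 0.25 + tol:
        return None
    root = math.sqrt(max(0.0, 1.0 - 4.0 * r))
    if root == 0.0:
        return None                              # K1 = 1/2: x1, x2 not determined
    # Both margins lie on the same side of 1/2; that side fixes the sign choice.
    eps = 1.0 if (p1_dot + p_dot1) >= 1.0 else -1.0
    k1 = 0.5 * (1.0 + eps * root)
    x1 = 0.5 - abs(2.0 * p1_dot - 1.0) / (2.0 * root)
    x2 = 0.5 - abs(2.0 * p_dot1 - 1.0) / (2.0 * root)
    if x1 < -tol or x2 < -tol:
        return None                              # would need a negative probability
    return k1, x1, x2


# Hypothetical table generated from K1 = 0.7, x1 = 0.1, x2 = 0.2.
print(two_question_fit(0.51, 0.15, 0.11, 0.23))
```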


4.10. More recent work on measures of association in meteorology. Gringorten, 
Bleeker, Brier, and others. In Section 3.1, we discussed measures of association 
suggested by Peirce, Doolittle, Köppen and others for meteorological problems.
Meteorologists have of course long been interested in the accuracy of weather
forecasts, and they have suggested many measures of association between the 
predicted weather and the weather that actually occurred. 

We shall not attempt to survey the large literature of this field in detail, 
especially since three relatively recent articles provide extensive reviews of it. 
The first, by R. H. Muller [109], gives abstracts of some 55 relevant publica- 
tions prior to 1944, including most of those described in Section 3.1. (See Clay- 
ton [23] for criticism of Muller’s abstracts of Clayton’s work.) The second, by 
W. Bleeker [10], includes references to a number of continental articles not 
mentioned by Muller, and analyzes a number of proposed measures in detail, 
especially as regards the behavior of a predictor who knows that his predictions 
will be compared with actuality by a particular measure. The third, by G. W. 
Brier and R. A. Allen [17] discusses key publications appearing up to 1951. 
In the following paragraphs, we want to mention a few articles of particular 
interest to us, especially some published since the three surveys cited above. 

The simplest case of interest to the meteorologists is where there is no order 
in the classifications and an asymmetrical interest in the two classifications. 
Sometimes the classifications are different, as when one is considering a par- 
ticular qualitative variable as a predictor of qualitative weather. For this case, 
a measure of association based on the Shannon-Wiener information notion has 
been suggested by J. L. Holloway, Jr. and M. A. Woodbury [81] and has been 
used by several meteorologists, notably I. I. Gringorten and his colleagues. We 
have referred to it in Section 4.6. Gringorten [70, pp. 69-70] also suggests 
independently the same proportional prediction measure described in [66, 
Section 9]. This measure is very natural if we think of the possibility of making 
probabilistic, rather than categorical, forecasts, a possibility to which we shall 
recur in a few paragraphs. Gringorten’s article also gives a brief general survey 
of measures of association in the meteorological context. 

Sometimes the two classifications are the same, as when one is considering 
association between a categorical forecast and a categorical event, with both 
forecast and event classified in the same way. In this case of “forecast verifica- 
tion” both the above measures may be used, as well as others that take the 
identity of the two classifications explicitly into account. The use of associa- 
tion measures in connection with meteorological prediction, both with and 
without order taken into account, is considered by van der Bijl [140]. 

A more complex situation is that in which some third classification is brought 
into the picture. One important example is the three-way classification: forecast
weather, observed weather, weather at time of forecast. Here interest is
usually centered in the extent to which the forecaster can improve on persist- 
ence forecasting or on forecasting based on climatic information conditional 
upon weather at forecast time. Some materials referred to in the above para- 
graphs bear upon this situation; we should also like to cite two articles by 
Gringorten, [68] and [71a], and a closely related report by Gringorten, Lund, 
and Miller [69]. These references use scoring schemes with scores based on 
probabilities. Gringorten [68] makes it very clear that the appropriate measure 
depends upon the question being asked. In [71], Gringorten works on the 
sampling problem for measures based on scores. 

An interesting problem is that of the construction of meaningful measures of 
association when the forecast is not categorical, but rather is itself a discrete 
probability distribution over a set of weather categories. Thus, for example, 
a prediction might be 


No rain (probability .1) 
Light rain (probability .6) 
Heavy rain (probability .3) 


and this prediction would be compared with that one of the three possibilities 
that later actually occurred. Suggestions for this kind of forecast prediction 
appear to go back at least to World War I, but it seems to have become of 
general interest only recently. Two recent articles relating to probabilistic 
forecasts are by G. W. Brier [16] and W. G. Leight [98]. 

If we attempt to construct a measure of association between probabilistic 
forecasts and the actual events later observed, we are faced with association 
between an essentially continuous distribution on a k—1 dimensional simplex 
(k categories, probabilities for each that sum to one) and a discrete distribution 
on k points (for the actual events). 
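
The text gives no formula for comparing a probabilistic forecast with the event that occurred. One simple possibility, assumed here purely for illustration and usually associated with Brier's name, is a quadratic score: the sum over categories of the squared difference between the forecast probability and an indicator of the observed category. The sketch is a scoring rule for a single forecast, not a measure of association in the sense of this paper, and the function name is hypothetical.

```python
def quadratic_score(forecast, observed_index):
    """Sum of squared differences between forecast probabilities and the outcome.

    forecast: probabilities over the k categories (should sum to one).
    observed_index: position of the category that actually occurred.
    Smaller is better; 0 means the occurring category was given probability 1.
    """
    return sum((f - (1.0 if i == observed_index else 0.0)) ** 2
               for i, f in enumerate(forecast))


forecast = [0.1, 0.6, 0.3]            # no rain, light rain, heavy rain
print(quadratic_score(forecast, 1))   # score if light rain occurred
print(quadratic_score(forecast, 0))   # score if no rain occurred
```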

Several articles take up Peirce’s 1884 theme relating to economic losses as 
an important factor in evaluating forecast utility. For the 2×2 case, we refer
to E. G. Bilham [9], H. C. Bijvoet and W. Bleeker [8], J. C. Thompson [135], 
J. C. Thompson and G. W. Brier [136], and G. W. Brier [18]. Gringorten 
[68 and 71a] considers more general cases by means of scores based directly 
on net losses. 

4.11. Association between species. Forbes, Cole, Goodall. In the ecological 
literature there is a series of articles dealing with 2×2 cross classifications of
the following kind:

            NUMBERS OF AREAS IN WHICH SPECIES A AND
               SPECIES B ARE OR ARE NOT FOUND

                                    B
                          Found          Not Found
A      Found              $N_{11}$       $N_{12}$
       Not Found          $N_{21}$       $N_{22}$

Thus, for example, in $N_{11}$ out of n marshes examined, grasses of species A and
B are both found, while in $N_{12}$ out of n marshes, species A is found but not
species B.

A review and bibliography of ecological articles dealing with measures of 
association in this context is given by Goodall [64, pp. 221-3]. The series seems 
to have started with an article by Forbes [49] in 1907, followed by a long gap, 
and then a number of more recent articles. Of these, a particularly extensive 
one is by Cole [28], in which Benini’s measure (see Section 3.3) was independ- 
ently proposed. 

4.12. Association between anthropological traits. Tylor, Clements, Wallis, 
Driver, Kroeber, Chrétien, Kluckhohn, and others. We have already discussed 
(Section 4.4) a proposal by the anthropologist, F. Boas. We now turn to a 
more special case than the one discussed by Boas, the 2×2 cross classification.
Writers in the fields of anthropology and linguistics have long been concerned
with 2×2 cross classifications similar to those discussed in the last section.
The earliest paper of which we know that deals at all with measures of associa- 
tion in these fields is by Edward B. Tylor [139] in 1889. Tylor discussed many 
examples of association between cultural traits, some dichotomous and some 
trichotomous, but he contented himself with observing sizable apparent
deviations from independence and did not suggest any numerical measures of
association. In the ensuing discussion Francis Galton said [139, p. 270] that
“ ... the degree of interdependence might with advantage be expressed in 
terms of a scale in which 0 represented perfect independence and 1 complete 
concurrence.” We now list and discuss briefly those subsequent papers of which 
we know in this area that seem to us most germane to our survey. 

In 1911, Jan Czekanowski [30], explicitly carrying Tylor’s work forward, 
discussed the use of Yule’s Q in ethnology and anthropology. Czekanowski 
also published a number of further papers dealing with 2×2 classifications.

In 1926, Forrest E. Clements and others [24] used the values of $\chi^2$ and the
resulting P-values in an examination of traits held in common by various 
Polynesian societies. An interesting controversy between Clements and Wilson 
D. Wallis [25, 143] followed. Wallis attacked Clements and his coauthors for 
using oversimplifying statistical methods and for drawing unjustified
anthropological conclusions by these methods. Another article by Clements [26],
discussing Q and $\phi$ prefixed by the appropriate ± sign, appeared in 1931. A quite
recent article [27] by Clements goes over the same ground with added com- 
ments on subsequent literature. 

In 1932, H. E. Driver and A. L. Kroeber [38] commented on the Clements- 
Wallis controversy, and used the following three measures in analyzing associa- 
tion between various pairs of societies: 


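$$\frac{p_{11}}{2}\left(\frac{1}{p_{1\cdot}} + \frac{1}{p_{\cdot 1}}\right), \qquad \frac{p_{11}}{\sqrt{p_{1\cdot}\, p_{\cdot 1}}}, \qquad \frac{p_{11}}{1 - p_{22}}.$$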


The 2×2 cross classifications to which these were applied referred to
populations of traits, and took the following form:

                                    Society B
                              Has            Has not
Society A      Has            $p_{11}$       $p_{12}$
               Has not        $p_{21}$       $p_{22}$

                              $p_{\cdot 1}$  $p_{\cdot 2}$

so that $p_{12}$, for example, is the proportion of traits observed present in Society
A and absent in Society B.
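
For concreteness, the three measures used by Driver and Kroeber can be computed directly from the four cell proportions of such a table: (p11/2)(1/p1. + 1/p.1), p11/sqrt(p1. p.1), and p11/(1 - p22). The following Python sketch is illustrative only; the function name, the names of the returned values, and the sample proportions are hypothetical and do not come from their article.

```python
import math


def driver_kroeber_measures(p11, p12, p21, p22):
    """Return the three 2x2 measures as a tuple, given cell proportions."""
    p1_dot = p11 + p12                       # traits present in Society A
    p_dot1 = p11 + p21                       # traits present in Society B
    mean_ratio = 0.5 * p11 * (1.0 / p1_dot + 1.0 / p_dot1)
    geometric = p11 / math.sqrt(p1_dot * p_dot1)
    joint_share = p11 / (1.0 - p22)          # ignores traits absent from both
    return mean_ratio, geometric, joint_share


# Hypothetical proportions of traits.
print(driver_kroeber_measures(0.4, 0.1, 0.2, 0.3))
```

None of the three measures gives credit for traits absent from both societies, except through the normalization of the proportions.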

In 1937, A. L. Kroeber and C. D. Chrétien [94] applied 2×2 measures of
association to linguistic classification. Several measures were discussed and 
compared. Such application to linguistics continued in several articles, notably 
[20]. A recent article by Chrétien in this line is [22]. It is interesting to observe 
that the article immediately following [22], by Joseph H. Greenberg [67], is 
one of the few instances we know in which descriptive statistics are constructed 
so as to have operational interpretations in the sense that we have discussed. 
Greenberg’s suggestions relate to measuring concentration in a single classifica- 
tion, or multinomial, population. 

In 1939, a critical survey of the application of measures of association to 
ethnological data was published by Clyde Kluckhohn [90]. This very inter- 
esting article contains an extensive bibliography, and it marshals many argu- 
ments for and against the use of measures of association in anthropological 
contexts. 

Driver [39], in the same year, compared in detail formal properties and
relations between some eight 2×2 measures of association. He was much concerned
with the effect of nonuniform marginal distributions on comparisons between
values of 2×2 measures.

In 1945, Chrétien [21] discussed a number of basic points, including several 
analyzed by Kluckhohn, regarding the use of measures of association. Here, 
for almost the first and only time in this line of papers, we find the problem of 
interpretation raised as Chrétien says (p. 488): “Primary in importance, it 
seems to me, is the need to determine more precisely the meaning of the scale 
of association. All association studies to date have confined their attention to 
the high positive values.” 

Finally, we wish to cite a 1953 survey article by Driver [40]. In its section 
on ethnology and social anthropology, there appears a discussion of measures 
of association for the 2×2 case.

4.13. Other suggestions. We conclude by listing a few other suggestions relat- 
ing to measures of association that do not fall naturally into the above classi- 
fication. 

Harris, Pearson. In a number of articles by J. A. Harris and others, [74], 
[75] and [76], there is a discussion of the following situation: Sometimes the 
existence of observations (individuals) in certain cells of a cross classification 
table is arithmetically, physically, or otherwise impossible. Harris and his
coauthors discuss the effect of this inherent emptiness of some cells on certain
traditional measures of association, and suggest modifications of these
measures. K. Pearson, commenting on Harris’ papers in [113], discusses the
computation of the coefficient of mean square contingency when careful a priori
consideration indicates that for certain cells the appearance of individuals in 
those cells is impossible. With the use of measures of association that have oper- 
ational meaning, rather than the coefficient of mean square contingency, the 
occurrence of zero frequencies in certain cells does not seem to us to be of special 
significance. See Sec. 2.1. The a priori considerations leading to the belief about 
zero frequencies may, however, suggest alternative ways of setting up the 
classifications that are more meaningful. 

Irwin. In 1934, J. O. Irwin [84] commented on measures of association and 
emphasized the importance of relating the use of such measures to the goals 
of the particular investigation at hand. He says (p. 87) that “ ... we should 
[not] do away with correlation coefficients or other measures of association, 
but should try to make the end point of our statistical analysis not a single 
coefficient which may be hard to interpret, but a result bearing a ‘physical’ 
meaning; the more easily the result may be understood by an intelligent lay- 
man, the better we should regard it as expressed.” Irwin ends his article by 
describing a particular case of careful and useful analysis based on measures of 
association applied to the data in various ways. 

It seems to us that, when the operational interpretation viewpoint towards 
association measures is taken, one is automatically influenced away from sterile 
arguments about which measure is “best.” For if different measures reflect 
different aspects of the population, no one is best in any abstract sense (al- 
though one may be most appropriate in a given case) and there is no reason 
why more than one should not be used. An analogy is to ask about measures 
of size for human beings. One might suggest weight, height, volume, girth, etc., 
but no one of these is best except perhaps in a particular context. 

Lakshmanamurti. In [97], M. Lakshmanamurti suggested a rather complex 
measure of association for the 2×2 case and compared it with Yule’s Q.

Fairfield Smith. In a recent article [126a] H. Fairfield Smith has complained 
entertainingly about the difficulty of interpreting conventional measures of 
association. Most of his article shows by example how one may compare two 
sample cross classifications by forming simple chi-square tests that emphasize 
some specific aspect of possible difference between the cross classifications. 

We end this paper with a quotation [126a, pp. 72-3] that expresses Smith’s 
dismay about the vague or nonexistent meaning of most association measures. 

“What can be the use to know that ghosts in my lord’s and lady’s chambers each wore
a sash with the symbol .6 if we do not know how the sash or its decoration may reflect
the more earthy bodies from which the ghosts have been supposed to emanate?”


COMPACT TABLE OF TWELVE PROBABILITY LEVELS OF THE
SYMMETRIC BINOMIAL CUMULATIVE DISTRIBUTION 
FOR SAMPLE SIZES TO 1,000 


William J. MacKinnon
University of Arizona 


A compact table of critical values for tests of the symmetric binomial 
cumulative distribution is presented. It covers twelve probability 
levels (.001, .01, .02, .05, .10, .20, .30, .50, .70, .80, .90, and .95) for sam- 
ple sizes to 1,000. Approximation methods of making such tests are also 
described, and notes on the theory and construction of the table are 
appended. 


1. INTRODUCTION


Whether a binomial probability p equals 1/2 has become an increasingly
important question in applied statistics. It is the basic question in the
sign test initiated by Fisher [3] and in tests between correlated proportions,
introduced by McNemar [5].¹

A notation convenient for a compact table of the binomial cumulative
distribution with p = 1/2 follows. Let s and r be the frequencies of “successes” and
of “failures,” such that s ≤ n/2 and r ≥ n/2. Represent the sum, r+s, by n and
the difference, r−s, by d. Let x be a whole number. The two-tail region of the
symmetric binomial cumulative distribution function, the region often used
to test whether the probability of “success” on a single trial is, indeed, 1/2, may
now be written:

$$P(s, n) = (1/2)^{n-1} \sum_{x=0}^{s} \binom{n}{x}, \qquad s = 0, 1, \ldots, n/2 \ \text{or}\ (n-1)/2. \tag{1}$$

Throughout this paper cumulative probabilities designated by P are two-tail,
and p, the theoretical probability of success on a single trial, is 1/2. “The two-tail
probability” means the chances of s or fewer occurrences of either type for
n = 2s+d and thus for d = n−2s.
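
Expression (1) can be evaluated exactly with integer arithmetic. The Python sketch below is offered only as an illustration of the definition; the function name is an assumption, and the table itself was not necessarily computed this way.

```python
from math import comb


def two_tail_probability(s, n):
    """Exact P(s, n) = (1/2)**(n-1) * sum_{x=0}^{s} C(n, x), for 0 <= s <= n/2."""
    if not 0 <= s <= n / 2:
        raise ValueError("s must satisfy 0 <= s <= n/2")
    return sum(comb(n, x) for x in range(s + 1)) / 2 ** (n - 1)


print(two_tail_probability(0, 5))    # n = 5, d = 5
print(two_tail_probability(0, 8))    # n = 8, d = 8
print(two_tail_probability(2, 10))   # n = 10, d = 6
```

When n is even and s = n/2, the middle term is counted in both tails, so the value returned exceeds 1.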

Dixon and Mood [2] constructed a well-known table of critical values based 
on this function; it serves at four levels of significance (.01, .05, .10, and .25) 
for sample sizes to 100. The power of the sign test, however, indicates the fre- 
quent need for large samples in its use [6], and a broad selection of probability 
levels is ordinarily a convenience. For these reasons the present paper presents 
a more comprehensive symmetric-binomial table. Despite its increased scope, 
the new table is compact because it is accompanied by special instructions and 
is constructed with an argument different from the one used in its predecessor. 
As a supplement, the paper includes an adaptation of approximation methods 
[10, 14] to the special case of the symmetric binomial cumulative distribution 
function (1). 





¹ The relation between these tests or their connection with the binomial distribution has been discussed in
several sources [7; 8; 12, pp. 335-7; 13].


2. A TABLE OF CRITICAL VALUES


Table 165 presents c’s, the critical values of s, at twelve two-tail probability
levels, the smallest α being .001, and covers all sample sizes to 1,000. Detailed
instructions for using the table follow it. These instructions, some of which 
are not valid if n>1,000, permit the determination of the two-tail probability 
values in the sequence 0, .001, .01, .02, .05, .10, .20, .30, .50, .70, .80, .90, .95, 1. 


TABLE 165 


CRITICAL VALUES OF s FOR THE SYMMETRIC BINOMIAL
CUMULATIVE DISTRIBUTION (n ≤ 1,000)


See instructions at end of table.

[Table body not recoverable from the source. Rows are indexed by d; columns are headed by the two-tail probabilities .001, .01, .02, .05, .10, .20, .30, .50, .70, .80, .90, and .95.]

Instructions: The table is used to test the hypothesis that the probability p of “success” on a single trial is 1/2,
only for sample sizes n ≤ 1,000. The following symbols are used:


s  the number of “successes” or the number of “failures,” such that s ≤ n/2

d  the absolute difference between the number of “successes” and the number of “failures”; this difference
equals n−2s

P  the two-tail probability of obtaining a value equal to or less than s′, a particular value of s, when d has the
particular value d′ (or when n has the particular value n′ = 2s′ + d′)


If d′ > 105, then P < .001. If d′ ≤ 105, enter the column headed by “d” and locate d′. Compare s′ with the entries
in the same row as d′ and apply whichever of the following rules is appropriate.

Case 1: s′ is equal to or less than the initial entry in the row of d′. In this case P is greater than the heading of the
column (if there be one) immediately to the left of that in which the initial entry lies and is less than the heading
of the column in which the initial entry is located. Examples: Suppose d′ = 5 and s′ = 0; then .05 < P < .10. Suppose
d′ = 25 and s′ = 14; then P < .001.

Case 2: s′ equals an entry other than the initial entry in the row of d′. In this case P is greater than the heading
of the column immediately to the left of that in which the entry lies and is less than the heading of the column in
which the entry is found. Example: Suppose d′ = 40 and s′ = 120; then .01 < P < .02.

Case 3: s′ is between two consecutive entries in the row of d′. In this case P is greater than the heading of the
column in which the left member of the two entries lies and less than the heading of the column in which the right
member appears. Example: Suppose d′ = 20 and s′ = 200; then .30 < P < .50.

Case 4: s′ is greater than the final entry in the row of d′. In this case P is greater than the heading of the column
in which the final entry lies and less than the heading of the discontinued column (if there be one) immediately to
the right of that in which the final entry appears. Examples: Suppose d′ = 90 and s′ = 400; then .001 < P < .01.
Suppose d′ = 2 and s′ = 127; then .95 < P.

* The only exception to the rule for Case 1. For d′ = 2 and s′ = 0, P = .50 exactly.

† The only exception to the rule for Case 2. For d′ = 8 and s′ = 0 (the second 0 in the row), .001 < P < .01.


One-tail tests may be made at some conventional levels of significance, e.g., 
at the .01 and .05 levels, by using the columns headed .02 and .10. For the .01 
and .05 probability levels, approximate critical values may be obtained for
n > 1,000 by substituting respectively 1.2879 and .9800 for k in
$(n-1)/2 - k\sqrt{n+1}$ [2, p. 560].
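
As a small illustration, the expression (n - 1)/2 - k sqrt(n + 1) can be evaluated as follows. The function name is hypothetical, and the sketch returns the unrounded value because the text does not state how the result is to be converted to an integer critical value.

```python
import math


def approx_critical_value(n, k):
    """Unrounded value of (n - 1)/2 - k*sqrt(n + 1), intended for n > 1,000."""
    return (n - 1) / 2 - k * math.sqrt(n + 1)


# Constants quoted in the text for the .01 and .05 two-tail levels.
print(approx_critical_value(1023, 1.2879))
print(approx_critical_value(1023, 0.9800))
```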

Each entry was determined and then verified either by use of a table different 
from the one from which the entry was originally derived or by a checking 
calculation whenever the entry depended upon considerable computation. It is 
believed, therefore, that the entry in each case is the true critical integer.

The tests are made in a minimum of time, since the footnote rules easily
permit the user to select the specific procedure for his problem. The use of d 
rather than n as the argument reduces considerably the space required for 
tabling, 105 rows being sufficient. 

Especially for small values of d, however, P and α may differ somewhat, the
critical values always yielding P’s on the “safe side.” The table, furthermore, 
has limits on both sample sizes and probability levels. 


3. OTHER METHODS 


Because of such limitations, the research worker will sometimes seek other 
techniques for making tests involving the symmetric binomial cumulative dis- 
tribution. The cumulative probabilities themselves are available to 7 decimals 
for samples to 150 [1], and the corresponding central-region probabilities ap- 
pear to 5 decimals in a special table for samples to 200 [15]. 

It can be seen from a recent survey [10] that the two most suitable
approximation methods when p = 1/2 are the simple normal approximation with the
correction for continuity, and the more laborious but empirically more accurate
Camp-Paulson approximation. We present these two approximations and their
maximum errors in special forms suitable only to the case in which p = 1/2.

The two-tail probability may be obtained by the normal approximation with
the correction for continuity by doubling the area under the normal curve to
the right of

$$z_1 = \frac{d-1}{\sqrt{n}}. \tag{2}$$

The maximum error (maximum departure of the approximate probability from
the true binomial value) of the two-tail probability associated with (2) is less
than

$$\delta_1 = \frac{.059}{n}. \tag{3}$$

Therefore, if n>58 one is assured that the maximum error of a two-tail
probability based on (2) is less than .001.²

The two-tail probability is obtained through the Camp-Paulson approximation
by doubling the area under the normal curve to the right of

$$z_2 = \frac{[r/(s+1)]^{1/3}\,[9 - 1/r] + 1/(s+1) - 9}{3\,\{[r/(s+1)]^{2/3}\,[1/r] + 1/(s+1)\}^{1/2}}. \tag{4}$$

The maximum error of the two-tail probability based on (4) is less than

$$\delta_2 = \frac{.029}{\sqrt{n}}. \tag{5}$$

That $\delta_2$ exceeds $\delta_1$ for $n \ge 5$ contrasts with the empirical generalization that
$z_2$ provides greater accuracy than $z_1$. Furthermore, if $n \ge 5$ the absolute
maximum error of the two-tail probability associated with $z_2$ is less than .01, whereas
that associated with $z_1$ is less than .10 [10, pp. 295, 300-1].

² Expression (3), resulting from materials in the book by Smith [14, pp. 6, 25], is more restrictive than a
comparable expression, $.560/\sqrt{n}$ [10, p. 295], and, where comparisons are possible, provides values smaller than
comparable maximum errors derived from a table [10, p. 295].
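
A small numerical comparison may make the two approximations concrete. The sketch below is illustrative only. It assumes that d = n - 2s, that the two-tail probability is twice the lower cumulative binomial probability, and that r in the Camp-Paulson expression stands for n - s; the last is an assumption, since r is not defined in this section. All function names are hypothetical.

```python
from math import comb, erfc, sqrt

def upper_tail(z):
    """Area under the standard normal curve to the right of z."""
    return 0.5 * erfc(z / sqrt(2))

def exact_two_tail(s, n):
    """Twice P(x <= s) for a binomial with p = 1/2."""
    return sum(comb(n, x) for x in range(s + 1)) / 2 ** (n - 1)

def normal_two_tail(s, n):
    """Normal approximation with continuity correction: z1 = (d - 1)/sqrt(n)."""
    d = n - 2 * s
    return 2 * upper_tail((d - 1) / sqrt(n))

def camp_paulson_two_tail(s, n):
    """Camp-Paulson approximation, taking r = n - s (an assumption)."""
    r = n - s
    q = r / (s + 1)
    numerator = q ** (1 / 3) * (9 - 1 / r) + 1 / (s + 1) - 9
    denominator = 3 * sqrt(q ** (2 / 3) / r + 1 / (s + 1))
    return 2 * upper_tail(numerator / denominator)

if __name__ == "__main__":
    s, n = 2, 10
    print(exact_two_tail(s, n), normal_two_tail(s, n), camp_paulson_two_tail(s, n))
```

For s = 2 and n = 10 the exact two-tail probability is 56/512, and both approximations can be checked against the stated error limits for that single case.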
APPENDIX A
BASES OF THE RULES FOR USING TABLE 165 


The various rules which appear in the instructions for Table 165 rest upon 
the nature of the symmetric binomial cumulative distribution function and 
upon the scope or range of the entries. 

The rules for locating P between two consecutive values in Table 165, or as
less than .001 or greater than .95, depend upon the monotonic increasing nature
of the symmetric binomial cumulative distribution function. Yet it is not
obvious that

$$P\{x \le s' \mid d = d';\ n = d' + 2s' = n'\} \tag{6}$$

increases monotonically with s when d is held at d'. The reason the monotonic
increase is not apparent is that if s increases from s' to s'+1, n simultaneously
increases from n'=d'+2s' to n'+2=d'+2(s'+1). In other words, P(s, n)
changes from P(s', n') to P(s'+1, n'+2). Thus it is necessary to prove the
monotonicity.

THEOREM 1. For each increase in s from s' to s'+1, where $s' \le (n'-1)/2$ and
$n' = d' + 2s'$, the probability function $P\{x \le s \mid d = d'\}$ will increase
monotonically by $(n'-2s'-1)\,n'!/[(s'+1)!\,(n'-s')!\,2^{n'+1}]$.

PROOF: To investigate monotonicity we analyze the probability

$$P\{x \le (s'+1) \mid d'\} = (1/2)^{n'+1} \sum_{x=0}^{s'+1} \binom{n'+2}{x}. \tag{7}$$

If the summation in (7) is expanded into individual terms, if each of these is
replaced by two via the familiar relation among combinatorial coefficients

$$\binom{m+1}{x} = \binom{m}{x} + \binom{m}{x-1},$$

if each resultant term similarly is replaced by two, and if the final combinatorial
symbols are re-grouped, the right-hand side of (7) becomes

$$
\begin{aligned}
&(1/2)^{n'+1}\Bigl[\sum_{x=0}^{s'+1}\binom{n'+1}{x} + \sum_{x=0}^{s'}\binom{n'+1}{x}\Bigr] \\
&\quad= (1/2)^{n'+1}\Bigl[\sum_{x=0}^{s'+1}\binom{n'}{x} + 2\sum_{x=0}^{s'}\binom{n'}{x} + \sum_{x=0}^{s'-1}\binom{n'}{x}\Bigr] \\
&\quad= (1/2)^{n'+1}\Bigl[4\sum_{x=0}^{s'}\binom{n'}{x} + \binom{n'}{s'+1} - \binom{n'}{s'}\Bigr] \\
&\quad= (1/2)^{n'-1}\sum_{x=0}^{s'}\binom{n'}{x} + (1/2)^{n'+1}\Bigl[\frac{n'!}{(s'+1)!\,(n'-s'-1)!} - \frac{n'!}{s'!\,(n'-s')!}\Bigr].
\end{aligned}
$$

Thus, referring to the left-hand member of (7) and to this last expression, we
have

$$P\{x \le (s'+1) \mid d'\} = P\{x \le s' \mid d'\} + \frac{(n'-2s'-1)\,n'!}{(s'+1)!\,(n'-s')!\,2^{n'+1}}. \tag{8}$$

The final term in (8), expressing the difference between the two probabilities,
will be positive if s' < (n'-1)/2 and 0 if s' = (n'-1)/2. Thus Theorem 1 is proved.

³ Where comparisons are possible, error maxima for the Camp-Paulson approximation (4) which are derived
from a table by Raff [10, p. 300] are smaller than comparable values computed from (3), which relates to the normal
approximation, and are smaller than similar values based on (5), which is available from the same source [10, p. 300].
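
Theorem 1 can be checked by exact rational arithmetic. The sketch below assumes, as the factor $n!/2^{n-1}$ in Appendix B suggests, that P denotes the two-tail probability $2^{-(n-1)}\sum_{x \le s}\binom{n}{x}$; under that assumption the increment carries the divisor $2^{n'+1}$. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def two_tail(s, n):
    """Two-tail probability 2 * P(x <= s) for p = 1/2, as an exact fraction."""
    return Fraction(sum(comb(n, x) for x in range(s + 1)), 2 ** (n - 1))

def increment(s, n):
    """Claimed increase when s -> s + 1 and n -> n + 2 with d held fixed."""
    return Fraction((n - 2 * s - 1) * factorial(n),
                    factorial(s + 1) * factorial(n - s) * 2 ** (n + 1))

def check_theorem(max_n=40):
    """Compare the direct difference with the closed form for all s <= (n - 1)/2."""
    for n in range(1, max_n + 1):
        for s in range((n - 1) // 2 + 1):
            assert two_tail(s + 1, n + 2) - two_tail(s, n) == increment(s, n)
    return True

if __name__ == "__main__":
    print(check_theorem())
```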

In Table 165 it is universally true that $c<(n-1)/2$. Therefore the probability
function (6), for which critical values are presented in Table 165,
increases with each unit increase in s for any value of s up to and including s=c.

The rule or procedure involving initial entries in rows stems from the fact
that, for a few small values of d, any value of s ($s \ge 0$) may imply $P\{x \le s \mid d\}>\alpha$.
In these cases no c exists, as in the spaces preceding some initial entries in
rows.

The procedure involving final entries in rows is appropriate because the first
missing entry at the end of each column and each subsequent entry is such
that $2c+d>1{,}000$. (It can be proved that c's increase monotonically down
columns.) Thus, given $n \le 1{,}000$, an assumption behind the procedures, and
provided $d \le 105$, any obtained value of s must be smaller than any missing
entries on the right-hand side in the row of the corresponding obtained value
of d.

The procedure or inference when d>105 has a similar justification. If
$n \le 1{,}000$ but $d>105$, it follows that $s \le 447$. Now 447 is c for d=105, $\alpha=.001$.
Accordingly, in consideration of the monotonic increase of c's down columns
and the restriction that $n \le 1{,}000$, any obtained value of s must be less than any
untabulated c for d>105, $\alpha=.001$. Hence $P\{x \le s \mid d>105\}<.001$, given
$n \le 1{,}000$.


APPENDIX B 
METHODS USED TO DETERMINE ENTRIES FOR TABLE 165 


The crucial step in constructing Table 165 therefore consists in finding for
each combination of d and $\alpha$ a value c of s such that the first term on the
right-hand side of (8) is less than or equal to $\alpha$ and the term on the left-hand side of
(8) is greater than $\alpha$. These two probabilities were calculated independently.

Up to n=149 critical values could easily be ascertained from seven-place
cumulative binomial probabilities [1]. These entries (and others up to n=200)
were checked with the aid of five-place probabilities of the central region of the
symmetric binomial cumulative distribution function [15]. In three cases
where it appeared from these tables that $P=\alpha$, use of tables having more than
five-place accuracy [1, 11] resulted in $P \ne \alpha$. Indeed, there is only one entry in
Table 165 (c=0, d=2) for which $P=\alpha$.

For $n \ge 150$ interpolation was required in five-place tables [4] between the
total sample sizes tabled there, say $n_t$. Put $n=2s+d=n_t+k$ $(0<k \le 49)$, and
put $y=n-x$. Then

$$\hat{P}\{x \le s \mid n\} = \hat{P}\{y \ge r \mid n_t+k\}, \tag{9}$$

where the circumflex differentiates interpolated probabilities from true
probabilities. A recurrence formula for interpolation [4, xxxi] is now adapted as

$$\hat{P}\{y \ge r \mid n_t+k\} = 2\sum_{i=0}^{k} e(k, i)\,E(n_t, r-i), \tag{10}$$

where e(k, i) refers to the tabled probability of i “successes” in k tries [9], and
$E(n_t, r-i)$ refers to the tabled probability of at least r-i “successes” in $n_t$ tries
[4]. Hence by (9) and (10),

$$\hat{P}\{x \le s \mid n\} = 2\sum_{i=0}^{k} e(k, i)\,E(n_t, r-i). \tag{11}$$

Next let $e_1', e_2', \ldots, e_{k+1}'$ denote the actual rounding errors of the respective
tabled individual terms $e_1, e_2, \ldots, e_{k+1}$, and let $E_1', E_2', \ldots, E_{k+1}'$ denote the
actual rounding errors of the respective tabled cumulative terms $E_1, E_2, \ldots,
E_{k+1}$ which are paired in order with the individual terms above in a particular
interpolation according to (11).
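
The identity behind (10) and (11) can be illustrated with exact arithmetic in place of tabled values. The sketch below assumes that r = n - s and that the factor 2 converts a one-tail into a two-tail probability; both readings are inferred from the surrounding text rather than stated in it, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def e(k, i):
    """Probability of exactly i successes in k tries, p = 1/2."""
    if i < 0 or i > k:
        return Fraction(0)
    return Fraction(comb(k, i), 2 ** k)

def E(n, r):
    """Probability of at least r successes in n tries, p = 1/2."""
    r = max(r, 0)
    return Fraction(sum(comb(n, y) for y in range(r, n + 1)), 2 ** n)

def two_tail_by_recurrence(n_t, k, s):
    """Build the two-tail probability for n = n_t + k from values at n_t."""
    r = (n_t + k) - s  # assumed meaning of r
    return 2 * sum(e(k, i) * E(n_t, r - i) for i in range(k + 1))

def two_tail_direct(n, s):
    """Direct two-tail probability 2 * P(x <= s)."""
    return Fraction(sum(comb(n, x) for x in range(s + 1)), 2 ** (n - 1))

if __name__ == "__main__":
    print(two_tail_by_recurrence(150, 7, 60) == two_tail_direct(157, 60))
```

With exact inputs the two routes agree identically; in the tables the inputs are rounded, which is why the rounding errors are treated next.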

Of the four summation terms obtained by the multiplication

$$2\sum_{i=1}^{k+1}[E_i+E_i'][e_i+e_i'] = 2\Bigl[\sum_{i=1}^{k+1}E_ie_i + \sum_{i=1}^{k+1}E_ie_i' + \sum_{i=1}^{k+1}E_i'e_i + \sum_{i=1}^{k+1}E_i'e_i'\Bigr] \tag{12}$$

the first term is the interpolated value, and the sum of the three remaining
terms is its error.

Using max $E_i'=5.1\times10^{-6}$ [4, xxi], max $e_i'=10^{-7}$ [1, vi], max $E_i=1$, and
$\sum e_i=1$, the absolute maximum error is found to be less than $2.03\times10^{-5}$.
A slightly smaller value which exceeds the maximum error and which is relative
to a given interpolation problem may be obtained. It is found by using
$10^{-7}\sum E_i$ instead of $(k+1)10^{-7}$ as an upper extreme value for the second
term within brackets on the right-hand side of (12).
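
The arithmetic behind the stated error limit can be retraced as follows. The sketch assumes that at most k + 1 = 50 terms enter an interpolation and that the sum is doubled as in (12); the function name and its parameters are hypothetical.

```python
def max_interpolation_error(k, max_E_err=5.1e-6, max_e_err=1e-7, sum_E=None):
    """Upper limit on the error of a doubled interpolation sum with k + 1 terms.

    If sum_E (the sum of the tabled cumulative terms E_i) is supplied, the
    second term uses 1e-7 * sum_E, giving the slightly smaller relative limit.
    """
    terms = k + 1
    if sum_E is None:
        second = terms * max_e_err          # sum of E_i * e_i' with each E_i <= 1
    else:
        second = max_e_err * sum_E
    third = max_E_err                       # sum of E_i' * e_i with sum of e_i = 1
    fourth = terms * max_E_err * max_e_err  # sum of E_i' * e_i'
    return 2 * (second + third + fourth)

if __name__ == "__main__":
    print(max_interpolation_error(49))
```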

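By reference to the absolute or the relative maximum error, most of the critical
values could be ascertained. Let $\epsilon$ be either $2.03\times10^{-5}$ or the smaller value
just mentioned for $\hat{P}\{x \le s \mid d\}$. If $\hat{P}\{x \le s \mid d\} \le \alpha-\epsilon$, then $P\{x \le s \mid d\} \le \alpha$, but
if $\hat{P}\{x \le s \mid d\}>\alpha+\epsilon$, then $P\{x \le s \mid d\}>\alpha$.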

When neither of the hypotheses of these implications obtained, recourse to 
fifteen-place factorial tables [11] resulted in definitive entries. 

The method consisted of continuing calculations involved in obtaining the
cumulative probability until the comparative size relation of $P\{x \le s \mid d\}$ and
$\alpha$ became unequivocal. This size relation is determined when for some integer
$\lambda$ $(0 \le \lambda \le s)$

$$\sum_{x=s-\lambda}^{s}\frac{1}{x!\,(n-x)!} > \frac{2^{n-1}\alpha}{n!} \tag{13}$$

or for some integer $\gamma$ $(0 \le \gamma \le s)$

$$\sum_{x=s-\gamma}^{s}\frac{1}{x!\,(n-x)!} + \frac{s-\gamma}{(s-\gamma)!\,(n-s+\gamma)!} \le \frac{2^{n-1}\alpha}{n!} \tag{14}$$

where, of course, n=2s+d.

In both (13) and (14), $n!/2^{n-1}$ has been transferred to the right, and reverse
summations are involved on the left.

The second term in (14) is an overestimate obtained by multiplying the final
calculated term of the summation by the number $(s-\gamma)$ of remaining uncalculated
terms, each of which would necessarily be smaller than the final calculated
term. In use of both (13) and (14), the maximum total rounding errors were
inconsequentially small with respect to reversing inequalities.

If (13) is true, $P\{x \le s \mid d\}>\alpha$, since $P\{x \le s \mid d\}$ is greater than the left-hand
member of (13) altered by transposing $2^{n-1}/n!$ to the left. Similarly, if (14)
obtains, $P\{x \le s \mid d\} \le \alpha$, for $P\{x \le s \mid d\}$ is less than the left-hand member of (14)
modified by transposition of $2^{n-1}/n!$ to the left. Thus the binomial critical value
c was determined.
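
The stopping rule of (13) and (14) can be sketched with exact fractions. The summation limits used here (terms taken in reverse order from x = s downward, with the uncalculated remainder bounded by the last term times the number of terms left) are a reading of the description above, not a transcription of the original formulas, and the function name is hypothetical.

```python
from fractions import Fraction
from math import factorial

def compare_with_alpha(s, d, alpha):
    """Decide whether the two-tail probability exceeds alpha.

    Returns (">", terms_used) or ("<=", terms_used). alpha must be a Fraction.
    """
    n = 2 * s + d
    threshold = Fraction(2 ** (n - 1)) * alpha / factorial(n)
    partial = Fraction(0)
    for x in range(s, -1, -1):              # reverse summation
        term = Fraction(1, factorial(x) * factorial(n - x))
        partial += term
        used = s - x + 1
        if partial > threshold:             # relation (13): P certainly exceeds alpha
            return ">", used
        remaining = x                       # terms x-1, ..., 0 not yet calculated
        if partial + remaining * term <= threshold:  # relation (14)
            return "<=", used
    raise AssertionError("unreachable: the last step always decides")

if __name__ == "__main__":
    print(compare_with_alpha(2, 6, Fraction(1, 10)))
    print(compare_with_alpha(2, 6, Fraction(1, 5)))
```

In both example calls the decision is reached before the full cumulative sum has been formed, which is the economy the procedure aims at.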

All the calculations for $n \ge 150$ were repeated to check the original results,
of which a detailed record had been made. 
THE FITTING OF STRAIGHT LINES WHEN BOTH
VARIABLES ARE SUBJECT TO ERROR* 


ALBERT MADANSKY 
RAND Corporation 


Consider the situation where X and Y are related by $Y=\alpha+\beta X$,
where $\alpha$ and $\beta$ are unknown and where we observe X and Y with error,
i.e., we observe $x=X+u$ and $y=Y+v$. Assume that $Eu=Ev=0$ and
that the errors (u and v) are uncorrelated with the true values (X and
Y). We survey and comment on the solutions to the problem of obtaining
consistent estimates of $\alpha$ and $\beta$ from a sample of (x, y)'s, (1) when
one makes various assumptions about properties of the errors and the
true values other than those mentioned above, and (2) when one has
various kinds of “additional information” which aids in constructing
these consistent estimates. The problems of obtaining confidence intervals
for $\beta$ and of testing hypotheses about $\beta$ are not discussed, though
approximate variances of some of the estimates of $\beta$ are given.


1. INTRODUCTION
1.1. Regression, Structural and Functional Relationships
1.2. Least Squares and Maximum Likelihood Estimation
2. THE METHOD OF GROUPING
3. USE OF INSTRUMENTAL VARIABLES
3.1. Two Linearly Related Instrumental Variables Observed with Error
3.2. One Instrumental Variable Observed Without Error
4. USE OF VARIANCE COMPONENTS
4.1. Replication of Observations
4.2. The Method of Grouping
4.3. Use of Instrumental Variables
5. THE BERKSON MODEL
6. ESTIMATION VIA CUMULANTS
7. ESTIMATION IN IDENTIFIABLE CASES
8. AN EXAMPLE
APPENDIX
REFERENCES


1. INTRODUCTION


If a physicist, say, were to give a statistician a set of observations on two
variables, tell him that only one of the two variables is subject to “error”
(where this “error” may be due to either errors in observation or random varia- 
tion, or perhaps both), and ask him to “fit a straight line to the data,” the 
statistician would only have to know how the observations were obtained, 
certain properties of the variables, that a linear relation exists, and the use to 
which the line is to be put, and, in the light of this information, he could fit 
the desired straight line to the data. If the same physicist were to come to the 
statistician with a set of observations on two variables and were to say that 
there were errors made in observing both of the variables, he would be surprised 
to hear the statistician request, in addition to the information mentioned above
(which the physicist is presumably able to give the statistician quite readily),
either more information on the range or standard deviation of the errors,
observations on a third set of variables related to the other two, or replications of
observations on each independent variable. In addition, he would be astounded
to see the statistician quake if he were to mention that he believed the errors
to be distributed normally.

* This paper is an outgrowth of a Master's Thesis submitted to the Department of Statistics, University of
Chicago. I am indebted for helpful comments and criticisms to T. E. Harris, W. H. Kruskal, L. J. Savage, and
especially to D. L. Wallace. I also wish to thank Arthur Stein for making available to me the data used in Section 8.

To the physicist’s eye, the situation in which both variables are subject to 
error does not seem to be quite as intractable as the statistician makes it out 
to be. If he plots the data, he could certainly plot an “eye-line” which, although 
it may not be the “best” line (in some sense), may be pretty close to the best 
line to be found by a statistician. If he knew of the method of least squares, he 
might argue that there are cases where for a large sample the estimate of the 
slope of the line relating Y to X lies between the least squares estimate of the 
slope of the line relating the observed y to the observed x and the reciprocal of 
the least squares estimate of the slope of the line relating the observed x to 
the observed y,¹ and hence an averaging of these two quantities calculated from
the observations would lead to an “estimate” of the true slope. Hence to the 
physicist the only job left for the statistician is to make a more precise estimate 
than the one the physicist has already made, that is, use an estimator with 
known “nice” properties (e.g., consistency), and attach a standard error to 
the estimate made. 


Assuming that the physicist’s initial shock upon hearing that although he 
has an “estimate,” the statistician doesn’t, is over, and that the physicist is 
willing to hear what the statistician has to say for himself, what sort of things 


would the statistician like to know? As in the situation in which there were no 
errors in either variable, or where only one variable was subject to error, he 
would like to know the answers to some preliminary questions designed to 
give him an understanding of the problem. First of all, he would like to know 
something about the situation out of which the observations arose, e.g., were 
the observations random pairs? Is the underlying linear relationship symmetric 
in X and Y (in a sense to be defined later) or not? Secondly, he would like to 
know the use to which the linear relation is to be put, e.g., does the physicist 
merely want an estimate of the parameters of the linear relation? Or is he trying 
to predict something by using the relation? Or does he want to test some 
hypothesis about one or more of the values of the parameters of the linear 
equations? Finally, the statistician would like to know the characteristics of 
the underlying true variables, e.g., are they fixed numbers? Or are they ran- 
dom? If so, what can we assume about their distribution? With this preliminary 
information, the statistician can then determine the type of relationship in 
which the physicist is interested . . . and then it only may be possible to obtain 
a consistent estimate of the linear relationship. Besides this orientation informa- 
tion which the physicist is prepared to give the statistician, the statistician, as 
we shall soon see, will probably also need technical information about the 
errors which the physicist is probably unprepared to give, because he doesn’t 
expect to be asked for such information.





1 For a proof of this sometimes useful fact, cf. [16] and [29]. 
1.1. Regression, Structural and Functional Relationships

What are the possible situations out of which the observations arise? One
situation is the following: Let X and Z be the true values which we are trying
to observe. Suppose the distribution of Z given X is normal with mean $\alpha+\beta X$
and variance $\sigma^2$. Then, regardless of whether X is a random or a fixed variable,
we can write, for fixed X, $Z \mid X=\alpha+\beta X+t$ where t is normally distributed
with mean zero and variance $\sigma^2$. (So far, this is the ordinary linear regression
situation with no errors in observing either variable. Rather, Z is subject to
random variation.) However, we do not observe $y=Z$ or $x=X$ but instead
$y=Z+v$ and $x=X+u$. If we let $Y=\alpha+\beta X$, then our observation on Z for a
given X (but not for a given x) can be written as $y=Y+t+v$. It is assumed
that $Eu=Ev=0$ and that u, v, and t are uncorrelated with each other and with
X and Y. This situation is usually called the regression situation [21, 27]. It is
essential that $t \ne 0$ for this model. The case where t=0 is considered separately.

One important property of the situation when X is a random variable is its
asymmetry. By this I mean that although $Y=E(Z \mid X)=\alpha+\beta X$, the expression
$X=(Y-\alpha)/\beta$, a result of algebraic manipulation of the original equation,
is not a meaningful relation in this context. The only meaningful “inverse
relation” here is $Y'=E(X \mid Z)$. In particular, when the joint distribution of X and
Z is bivariate normal, then $E(X \mid Z)=\gamma+\delta Z$, which is not the result of solving
the equation $Y=\alpha+\beta X$ for X. Another important point to note is that when
we write $y=Y+t+v$, we must distinguish between t and v. The variable v is
an error of observation which we presumably may be rid of by making finer
and finer observations, whereas even if we were rid of v, we would never be
rid of t. The variable t is inextricably tied up to the distribution of Z given X.
The variable v is what Tukey [45] would call “fluctuation” and t is what he
would call the “individual part” of an observed quantity. Finally, one should
note that it makes no difference in the case in which there is no error in X
whether we fix X and observe $Z \mid X$ or choose random pairs of observations, for
in either case, in the relation $Y=E(Z \mid X)=\alpha+\beta X$, X is not treated as a random
variable in considering the linear relationship of interest, namely the
expectation of Z given X.

When X is observed with error, we can still select our pairs of observations 
either as random pairs or by fixing the value of x and observing the correspond- 
ing y. For example, suppose we wish to estimate $\beta$, the density of iron, by
making use of the relation MASS = $\beta$ VOLUME. We can either select pieces of iron
at random, so that our pairs of observations are random, or select pieces of 
iron of predetermined volumes, where the volumes are measured by some tech- 
nique which yields the true volume of a piece of iron plus some random error. 
In this case, it makes a great deal of difference whether or not we can fix x in 
obtaining our observations, as we shall see later. 

Another situation in which one might be interested in a linear relation
between variables X and Y is the following. Let X and Y be the true values
which we are trying to observe, and let them be linearly related by $Y=\alpha+\beta X$.
In this case X, and hence Y, may be either random or nonrandom variables.
We observe $y=Y+v$ and $x=X+u$. Again, it is assumed that $Eu=Ev=0$ and
further that in situations where X and Y are random variables, $EYv=EXu=0$.
If X is a random variable, the relation $Y=\alpha+\beta X$ is usually called a structural
relation. (However, Lindley [27] calls this a functional relation. It is in this
situation that “confluence analysis” [17, p. 525] is applied.) $Y=\alpha+\beta X$ has
been called a functional relation by Kendall [21, 22] when X is not random.
One immediately notes that this latter situation is a degenerate case of the
regression situation, in that here t=0. One also notes that this situation is
symmetric in that $X=(Y-\alpha)/\beta=\gamma+\delta Y$ is an equally meaningful way of
writing the relation $Y=\alpha+\beta X$ in this context.

In any given situation, though, it may be difficult to determine whether to 
treat the relation as structural or functional. In the above example, for in- 
stance, we might interpret the linear relation as a functional relation by as- 
suming that the pieces of iron we use are not a random sample from the popu- 
lation of pieces of iron, but merely what iron we had available and that, for a 
given piece of iron, there is a true mass and a true volume which we observe 
with random error due to the inaccuracies of our measuring devices. On the 
other hand, we might assume that the iron we had available was a random 
sample from the population of pieces of iron, so that the true mass and true 
volume of any piece of iron is a random variable. The determination of whether 
one treats the relation as structural or as functional depends on what sort of 
inferences one wishes to make. In this case, our treatment depends on whether 
we wish to estimate the density of iron or the density of the iron in our back- 
yard. 

What are some of the reasons for wanting to estimate $\alpha$ and $\beta$? We may be
interested in estimating $\alpha$ and $\beta$ because we are interested in the structural
relation between two variables and hence these values are of intrinsic interest
to us. For example, we may be considering the relation MASS = $\beta$ VOLUME
for a given element and desire to estimate $\beta$, the density of the element. Or
we may have some hypothesis about the values of $\alpha$ and $\beta$ and might need
estimates of these quantities for use in the statistic to be used to test this
hypothesis. Or we might be interested in estimating $\alpha$ and $\beta$ for predictive
purposes. That is to say, we may at some future time want to observe a value
of X without error and use the relation $Y=\alpha+\beta X$ to predict Y from this X.
From the symmetry of the structural and functional relations, if we want to do
so, we can also observe a Y without error and predict X from the equation
$X=(Y-\alpha)/\beta$. This, however, cannot be done in the regression situation, as
has been pointed out earlier.

One should note that there is another problem in this context which is also 
called the prediction problem. This is the situation in which one can never 
hope to observe X or Y without error, and hence is only interested in predicting 
$Ey=Y$ for a new observed $x=X+u$. But this is just the case in which the least-
squares regression of y on x works, for our independent variable is no longer
X but instead x, which is observed without error. In this case, the statistician
has no difficulty in estimating the parameters of the linear relation of interest. 
(Cf. [11, 27, 48] on this point. The confusion here is a result of and a good exam- 
ple of the confusion between regression and structure-function.) 
1.2. Least Squares and Maximum Likelihood Estimation


I have spent some time outlining the preliminary information necessary to 
the statistician before he can undertake to estimate a linear relation between 
two variables observed with or without error. But once the statistician has this 
preliminary information, thereby ascertaining what type of linear relationship 
obtains, he still cannot estimate the linear relationship when both variables 
are observed with error. 

Let us say that we observe $x_i=X_i+u_i$ and $y_i=Y_i+v_i$ where $Y_i=\alpha+\beta X_i$,
and assume that $Eu_i=Ev_i=0$, that the errors ($u_i$ and $v_i$) are uncorrelated with
each other and with the true values ($X_i$, $Y_i$), that our successive observations
are independent, and that Var $X_i=\sigma_X^2$, Var $u_i=\sigma_u^2$, Var $v_i=\sigma_v^2$, Var $x_i=\sigma_x^2$,
Var $y_i=\sigma_y^2$, and Var $Y_i=\sigma_Y^2$ for all i. Then using ordinary least squares
techniques (i.e., minimizing $\sum w_i(y_i-\alpha-\beta x_i)^2$ where $w_i$ is the reciprocal of the
variance of $y_i-\alpha-\beta x_i$ given $x_i$, i.e., $w_i=1/\sigma_v^2$) is not correct, for use of this
method yields efficient, consistent estimates of $\beta/(1+(\sigma_u^2/\sigma_X^2))$, not of $\beta$. By
using ordinary least squares techniques, we are only minimizing “vertical”
error, error in the y direction. Our situation is such that we also have “horizontal”
error which should be taken into account in estimating $\beta$. To use least
squares estimation correctly, as Lindley [27] points out, one should take
account of both errors by minimizing

$$\sum_{i=1}^{n} w_i(\beta)\,(y_i-\alpha-\beta x_i)^2$$

where the $w_i(\beta)$'s are proportional to the reciprocals of the variance of
$y_i-\alpha-\beta x_i$ given $X_i$, i.e., $w_i(\beta)=k/(\sigma_v^2+\beta^2\sigma_u^2)$, where k does not depend on i.
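
A short simulation can illustrate the attenuation of the ordinary least squares slope. The sketch below is only an illustration under assumed values (normal true values and errors, the chosen variances, sample size, and seed are all hypothetical); it is not part of the paper's argument.

```python
import random

def simulate(n, alpha, beta, sd_X, sd_u, sd_v, seed=0):
    """Draw n pairs (x, y) with x = X + u and y = alpha + beta*X + v."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        X = rng.gauss(0.0, sd_X)
        xs.append(X + rng.gauss(0.0, sd_u))
        ys.append(alpha + beta * X + rng.gauss(0.0, sd_v))
    return xs, ys

def ols_slope(xs, ys):
    """Ordinary least squares slope of y on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

if __name__ == "__main__":
    beta, sd_X, sd_u = 2.0, 1.0, 1.0
    xs, ys = simulate(50_000, 1.0, beta, sd_X, sd_u, 0.5)
    limit = beta / (1 + sd_u ** 2 / sd_X ** 2)   # beta / (1 + sigma_u^2 / sigma_X^2)
    print(ols_slope(xs, ys), limit)
```

With equal variances for X and u the slope settles near half of the true value, which is the limit given by the expression in the text.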

If we knew $\lambda=\sigma_v^2/\sigma_u^2$, then Var $(y_i-\alpha-\beta x_i)=\sigma_v^2+\beta^2\sigma_u^2=(\lambda+\beta^2)\sigma_u^2$. Hence
if $w_i(\beta)=1/(\lambda+\beta^2)$, we could minimize $\sum w_i(\beta)(y_i-\alpha-\beta x_i)^2$ with respect to
$\beta$ quite readily, and obtain equation (3) below as our estimate of $\beta$. Lindley
points out that this is the same estimate as that obtained by minimizing the
distance between ($x_i$, $y_i$) and ($X_i$, $Y_i$) for all i. The method of weighted least
squares, with weights depending on $\beta$, will also give an estimate of $\beta$ if either
$\sigma_u^2$ or $\sigma_v^2$, or both, rather than $\lambda=\sigma_v^2/\sigma_u^2$, is known. If one assumes that X, Y,
u, and v are each normally distributed, with $Eu=Ev=Euv=EXu=EXv
=EYu=EYv=0$, then, as will be seen later, the method of maximum likelihood
will give the same estimate as that obtained by the method of weighted
least squares, whatever one of the above assumptions is made about $\sigma_u^2$ and
$\sigma_v^2$, but not without one of these assumptions. Thus, to use standard statistical
techniques of estimation to estimate $\beta$, one needs additional information about
the variance of the errors.

Let us see exactly where the difficulty arises in using the aforementioned 
techniques. I shall consider the structural relation in detail here. The analysis 
of the regression situation is given by Lindley [27], and of the functional 
relation by Kendall [21]. 

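Suppose we observe a random sample, $(x_1, y_1), \ldots, (x_n, y_n)$. Let $Ex_i=EX_i
=\mu$, $Ey_i=EY_i=E(\alpha+\beta X_i)=\alpha+\beta\mu$, $\sigma_x^2=\sigma_X^2+\sigma_u^2$, $\sigma_y^2=\beta^2\sigma_X^2+\sigma_v^2$, and
Cov $(x_i, y_i)=\beta\sigma_X^2$. Let the $x_i$ and $y_i$ be normally distributed with these
parameters. We then have six parameters, namely $\mu$, $\sigma_X^2$, $\sigma_u^2$, $\sigma_v^2$, $\alpha$, and $\beta$. (Note
that we already have assumed Cov $(u_i, v_i)=0$.) Our sufficient statistic is a
quintuple, namely $(\sum x, \sum y, \sum x^2, \sum y^2, \sum xy)$. We are mainly interested in
estimating $\beta=$ Cov $(X, Y)/\sigma_X^2$. But we can only estimate $\mu$, $\alpha+\beta\mu$, $\sigma_x^2
=\sigma_X^2+\sigma_u^2$, $\sigma_y^2=\sigma_Y^2+\sigma_v^2$, and Cov $(x, y)=\beta\sigma_X^2=$ Cov $(X, Y)$. The maximum
likelihood estimates of the parameters of the distribution of (X, Y) are
$\hat\mu=\bar{x}$, $\hat\alpha=\bar{y}-\hat\beta\bar{x}$, $\hat\sigma_X^2=\hat\sigma_x^2-\hat\sigma_u^2$, $\hat\sigma_Y^2=\hat\sigma_y^2-\hat\sigma_v^2$, and $\widehat{\mathrm{Cov}}(X, Y)=\widehat{\mathrm{Cov}}(x, y)$.
But $\sigma_Y^2=\beta^2\sigma_X^2$, so, disregarding the equations in $\hat\alpha$ and $\hat\mu$, our equations
become:

$$\hat\sigma_x^2 = \hat\sigma_X^2+\hat\sigma_u^2,$$

$$\hat\sigma_y^2 = \hat\beta^2\hat\sigma_X^2+\hat\sigma_v^2,$$

and

$$\widehat{\mathrm{Cov}}(x, y) = \hat\beta\hat\sigma_X^2.$$

We therefore have three equations in four unknowns, namely $\hat\beta$, $\hat\sigma_X^2$, $\hat\sigma_u^2$, and
$\hat\sigma_v^2$. Hence if we knew either $\sigma_u^2$, $\sigma_v^2$, or $\sigma_v^2/\sigma_u^2$ and were sure that Cov $(u_i, v_i)
=0$, we could estimate $\beta$. With this estimate of $\beta$, we can always estimate $\alpha$
by $\hat\alpha=\bar{y}-\hat\beta\bar{x}$.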

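If $\sigma_v^2$ is known, we have from the above equations

$$\hat\beta = \frac{\sum_{i=1}^{n}(y_i-\bar{y})^2 - n\sigma_v^2}{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}. \tag{1}$$

If $\sigma_u^2$ is known,

$$\hat\beta = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2 - n\sigma_u^2}. \tag{2}$$

If $\lambda=\sigma_v^2/\sigma_u^2$ is known, $\beta$ is estimated by

$$\hat\beta = \frac{\sum(y_i-\bar{y})^2-\lambda\sum(x_i-\bar{x})^2+\Bigl\{\bigl[\sum(y_i-\bar{y})^2-\lambda\sum(x_i-\bar{x})^2\bigr]^2+4\lambda\bigl[\sum(x_i-\bar{x})(y_i-\bar{y})\bigr]^2\Bigr\}^{1/2}}{2\sum(x_i-\bar{x})(y_i-\bar{y})}, \tag{3}$$

all sums running over $i=1, \ldots, n$.

(Lindley's estimate [27] is incorrect. See Appendix and [6].) One can easily
verify directly that the method of weighted least squares estimation yields
the same estimate of $\beta$. Smith ([40], p. 12) gives an elegant verification of this.

If both $\sigma_u^2$ and $\sigma_v^2$ are known, we need not compute $\lambda$ but instead can use
these bits of information separately and obtain another estimate of $\beta$, namely

$$\hat\beta = \operatorname{sgn}(\hat\beta)\left[\frac{\sum_{i=1}^{n}(y_i-\bar{y})^2 - n\sigma_v^2}{\sum_{i=1}^{n}(x_i-\bar{x})^2 - n\sigma_u^2}\right]^{1/2}, \tag{4}$$

where $\operatorname{sgn}(\hat\beta)=\operatorname{sgn}\bigl[\sum x_iy_i-\sum x_i\sum y_i/n\bigr]$.

This can easily be derived from the above equations (cf. also [39]). Here,
however, neither (1), (2), (3), nor (4) are maximum likelihood estimates of $\beta$.
The only optimal property these estimates are known to have is consistency.
The solution of the maximum likelihood equations for $\hat\beta$ and $\hat\sigma_X^2$, when both
$\sigma_u^2$ and $\sigma_v^2$ are known, hasn't yet been determined.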
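
The four estimates can be tried on simulated data. The sketch below is illustrative only: the simulated model, its parameter values, and the function names are assumptions, and estimate (4) is taken here to be the signed square root of the ratio of the corrected sums of squares, a form reconstructed from the three equations above rather than copied from a legible original.

```python
import random
from math import copysign, sqrt

def simulate(n, alpha, beta, sd_X, sd_u, sd_v, seed=1):
    """Draw n pairs (x, y) with x = X + u and y = alpha + beta*X + v."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        X = rng.gauss(0.0, sd_X)
        xs.append(X + rng.gauss(0.0, sd_u))
        ys.append(alpha + beta * X + rng.gauss(0.0, sd_v))
    return xs, ys

def slope_estimates(xs, ys, var_u, var_v):
    """Return estimates (1), (2), (3), (4) of beta for known error variances."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    lam = var_v / var_u
    b1 = (syy - n * var_v) / sxy                       # var_v known
    b2 = sxy / (sxx - n * var_u)                       # var_u known
    diff = syy - lam * sxx
    b3 = (diff + sqrt(diff ** 2 + 4 * lam * sxy ** 2)) / (2 * sxy)  # lambda known
    # Assumes both corrected sums of squares are positive (no truncation rule applied).
    b4 = copysign(sqrt((syy - n * var_v) / (sxx - n * var_u)), sxy)  # both known
    return b1, b2, b3, b4

if __name__ == "__main__":
    xs, ys = simulate(50_000, 1.0, 2.0, 1.0, 0.5, 0.5)
    print(slope_estimates(xs, ys, 0.25, 0.25))
```

In a large sample all four values should lie close to the true slope used in the simulation, in line with the consistency property noted in the text.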

If both $\sigma_u^2$ and $\sigma_v^2$ are known, we may assume that Cov $(u, v) \ne 0$. We then
have three maximum likelihood equations in three unknowns, and (4) above
is the maximum likelihood estimate of $\beta$. In this case,

$$\widehat{\mathrm{Cov}}(u, v) = \widehat{\mathrm{Cov}}(x, y) - \frac{1}{n}\sqrt{\Bigl(\sum_{i=1}^{n}(y_i-\bar{y})^2 - n\sigma_v^2\Bigr)\Bigl(\sum_{i=1}^{n}(x_i-\bar{x})^2 - n\sigma_u^2\Bigr)}.$$


It has been suggested (Allen [2]) that knowledge of $\lambda$ is better than knowledge
of either $\sigma_u^2$ or $\sigma_v^2$ or both, for if, for example, $\sigma_u^2$ is known, it may be that
$\hat\sigma_x^2$, being a random variable, will be less than $\sigma_u^2$, an absurd result. If, on the
other hand, only $\lambda$ were known, we would not be led to such an absurdity.
However, if we modify estimates (1), (2), and (4), so that if

$$\sum_{i=1}^{n}(y_i-\bar{y})^2 - n\sigma_v^2 < 0$$

we take $\hat\beta=0$, and if

$$\sum_{i=1}^{n}(x_i-\bar{x})^2 - n\sigma_u^2 < 0,$$

we take $\hat\beta=\infty$, no absurd results will be obtained.

The case in which both $\sigma_u^2$ and $\sigma_v^2$ are known is an over-identified situation
(since only knowledge of their ratio is necessary for identifiability). In this
case, it would seem reasonable to use all the available information in the hope
of achieving a small variance of the estimate of $\beta$. The example of Section 8
bears out this contention that estimate (4) is the “best” one (in the sense of
smallest variance) to use when both $\sigma_u^2$ and $\sigma_v^2$ are known.

To obtain the asymptotic variances of estimates (1) and (2), one can use the
formula of Section 6, noting that essentially these estimates are ratios of
cumulants. Also, one can easily see that the asymptotic variance of estimate
(4) is $(4\beta^2)^{-1}$ times the asymptotic variance of the square of estimate (4), which
can be computed via the formula of Section 6.

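Creasy [6] has shown that

$$\hat\phi \pm \tfrac{1}{2}\sin^{-1}\left\{\frac{2\,t_\gamma(n-2)\,\lambda^{-1/2}\Bigl[\sum(x_i-\bar{x})^2\sum(y_i-\bar{y})^2-\bigl(\sum(x_i-\bar{x})(y_i-\bar{y})\bigr)^2\Bigr]^{1/2}}{\sqrt{n-2}\,\Bigl\{\bigl[\sum(x_i-\bar{x})^2-\lambda^{-1}\sum(y_i-\bar{y})^2\bigr]^2+4\lambda^{-1}\bigl[\sum(x_i-\bar{x})(y_i-\bar{y})\bigr]^2\Bigr\}^{1/2}}\right\}$$

are $100\gamma\%$ confidence limits for $\phi=\tan^{-1}\lambda^{-1/2}\beta$, where $\hat\phi$ is the arc tangent of
$\lambda^{-1/2}$ times estimate (3) and $t_\gamma(n-2)$ is the value of the t(n-2)-distributed random variable
exceeded in absolute value with probability $1-\gamma$. Let $\phi_U$ and $\phi_L$ denote the upper
and lower of these limits. We suggest that a rough estimate of the variance of
estimate (3) is max $((\tan\phi_U-\tan\hat\phi)^2/4\lambda^{-1}, (\tan\phi_L-\tan\hat\phi)^2/4\lambda^{-1})$, where
$\gamma=.95$.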

So far, all I have pointed out is that we could not have solved the afore- 
mentioned equations for maximum likelihood estimates of 8 without additional 
information. Three questions arise: 

(1) Can we estimate these parameters in the normal case without using
additional information by some method other than least squares or
maximum likelihood?

(2) If we cannot, what other kinds of information, besides knowledge of
$\sigma_u^2$, $\sigma_v^2$, or $\lambda$, can be used to obtain an estimate of $\beta$ in the normal case?

(3) Do we also need additional information in applying the method of
maximum likelihood in estimating $\beta$ if we assume that the x's and y's
are not normally distributed?

As to question 1, the difficulty we had in estimating $\beta$ is really rooted in the
problem of identifiability. Briefly, a set of parameters is nonidentifiable if
more than one set of parameters can give rise to the same distribution of the
observed random variables. For example, in our case the parameters


o,2 Cy? B a 


2 1 v— 4p 
1 
4 


3 v— ty 


lead to the same distribution of $x$ and $y$, namely a normal distribution with 
$Ex=\mu$, $Ey=\nu$, $\sigma_x^2=\sigma_y^2=1$, and $\rho(x, y)=1/2$. Reiersol [37] has answered 
question 1 (cf. also [2, 32, 43]) by proving that if $u$ and $v$ are each normally 
distributed, then $\alpha$ and $\beta$ are nonidentifiable if and only if $X$ and $Y$ are constants 
or normally distributed. What this means is that one cannot estimate $\alpha$ and $\beta$ 
at all when the errors are normally distributed in any functional relation, in 
the regression situation, or in a structural relation where $X$ and $Y$ are normally 
distributed unless one has additional information sufficient to make these 
parameters identifiable. Without such information, there is no way of telling 
two sets of parameters apart by considering the distribution they define. In 
[24], Kiefer and Wolfowitz answer question 3 by proving that in identifiable 
cases, the maximum likelihood estimates of the regression parameters are in 
fact strongly consistent, i.e., with probability one they converge to the true 
parameters as $n$ approaches infinity. 
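
The nonidentifiability discussed in this paragraph can be checked numerically. The sketch below is only an illustration: the two parameter sets are hypothetical values chosen here (not taken from the paper's own example), and the function name `observed_moments` is likewise an assumption. It uses the model $x = X + u$, $y = \alpha + \beta X + v$ with uncorrelated $X$, $u$, $v$, and shows that two different sets of $(\sigma_X^2, \sigma_u^2, \sigma_v^2, \beta)$ give the same second moments of $(x, y)$, so that under normality they give the same distribution once $\alpha$ is set to $\nu - \beta\mu$.

```python
from fractions import Fraction as F


def observed_moments(var_X, var_u, var_v, beta):
    """Return (Var x, Var y, Cov(x, y)) for x = X + u, y = alpha + beta*X + v,
    with X, u and v mutually uncorrelated."""
    var_x = var_X + var_u
    var_y = beta ** 2 * var_X + var_v
    cov_xy = beta * var_X
    return var_x, var_y, cov_xy


# Two hypothetical parameter sets (illustrative values, not the paper's table).
set_a = dict(var_X=F(1), var_u=F(0), var_v=F(3, 4), beta=F(1, 2))
set_b = dict(var_X=F(3, 4), var_u=F(1, 4), var_v=F(2, 3), beta=F(2, 3))

moments_a = observed_moments(**set_a)
moments_b = observed_moments(**set_b)
print(moments_a, moments_b)
```

Both sets yield unit variances and covariance 1/2, so no amount of data on $(x, y)$ alone can separate them when everything is normal.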

The rest of this paper is divided into two parts. In the first (Sections 2-5), 
we will deal with the problem of what other methods of estimating $\beta$ (and hence 
$\alpha$ by $a=\bar y-b\bar x$) exist in the unidentifiable case using other kinds of information. 
In the second part (Sections 6-7), we will deal with various estimates of $\beta$ in 
the identifiable case when the distributions of $X$ and $Y$ are unknown so that 
the method of maximum likelihood cannot be applied. Finally, an example of 
the use of these estimates will be given. 

2. THE METHOD OF GROUPING 


A suggested method of solving the problem of estimating $\beta$ in the functional 
relation situation when both variables are observed with error is the method of 
grouping (sometimes called the method of group averages [31], and known by 
many other names, cf. [8, p. 137]). The method in simplest form consists of 
ordering the observed pairs $(x_i, y_i)$ (in a manner to be described later), selecting 
proportions $p_1$ and $p_2$ such that $p_1+p_2\le 1$, placing the first $np_1$ pairs in one 
group ($G_1$) and the last $np_2$ pairs in another group ($G_3$), discarding $G_2$, the middle 
group of observations (if $p_1+p_2<1$), and estimating $\beta$ by 

$$b = \frac{p_1^{-1}\sum_{G_1} y_i - p_2^{-1}\sum_{G_3} y_i}{p_1^{-1}\sum_{G_1} x_i - p_2^{-1}\sum_{G_3} x_i}.$$

Good graphical explanations of the intuitive rationale behind this estimate 
are given in [15] and [31]. The mathematical rationale behind the use of this 
estimate can be seen by examining the first two moments of 

$$b_1 = \Big(p_1^{-1}\sum_{G_1} x_i - p_2^{-1}\sum_{G_3} x_i\Big)\Big/n,$$

$$b_2 = \Big(p_1^{-1}\sum_{G_1} y_i - p_2^{-1}\sum_{G_3} y_i\Big)\Big/n,$$

$$\beta_1 = \Big(p_1^{-1}\sum_{G_1} X_i - p_2^{-1}\sum_{G_3} X_i\Big)\Big/n,$$

and 

$$\beta_2 = \Big(p_1^{-1}\sum_{G_1} Y_i - p_2^{-1}\sum_{G_3} Y_i\Big)\Big/n = \beta\Big(p_1^{-1}\sum_{G_1} X_i - p_2^{-1}\sum_{G_3} X_i\Big)\Big/n.$$

Then $\beta=\beta_2/\beta_1$. If the grouping is independent of the errors, then 

$$\operatorname{Var}(b_1-\beta_1) = n^{-2}\operatorname{Var}\Big(p_1^{-1}\sum_{G_1} u_i - p_2^{-1}\sum_{G_3} u_i\Big) = \sigma_u^2(p_1^{-1}+p_2^{-1})/n$$

and 

$$\operatorname{Var}(b_2-\beta_2) = \sigma_v^2(p_1^{-1}+p_2^{-1})/n.$$

We see that as $n\to\infty$, $b_1\to\beta_1$ and $b_2\to\beta_2$ in probability. Then $b=b_2/b_1$ would be 
a consistent estimate if only (1) the grouping is independent of the errors, 
and (2) we can be sure that as $n\to\infty$, $b_1$ does not approach zero. Condition (2) 
is essentially Wald's condition [47], namely 

$$\liminf_{n\to\infty}\left|\Big(p_1^{-1}\sum_{G_1} X_i - p_2^{-1}\sum_{G_3} X_i\Big)\Big/n\right| > 0.$$
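
The grouping estimate defined above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's own procedure: the function name `grouping_estimate` is hypothetical, the pairs are ordered by the observed $x_i$, and group sizes are taken as `round(n * p)`. Because $p_1^{-1}\sum_{G_1} y_i = n\bar y_1$ when $G_1$ holds exactly $np_1$ points, the estimate is computed as a ratio of differences of group means.

```python
def grouping_estimate(x, y, p1, p2):
    """Grouping estimate of the slope: order pairs by observed x, put the
    first n*p1 pairs in G1 and the last n*p2 pairs in G3, discard the middle,
    and return (ybar_1 - ybar_3) / (xbar_1 - xbar_3)."""
    n = len(x)
    pairs = sorted(zip(x, y))                 # ordering by the observed x
    n1, n3 = round(n * p1), round(n * p2)     # assumed rounding rule
    if n1 < 1 or n3 < 1 or n1 + n3 > n:
        raise ValueError("p1 and p2 must give two non-empty, disjoint groups")
    g1, g3 = pairs[:n1], pairs[n - n3:]
    xbar1 = sum(xi for xi, _ in g1) / n1
    ybar1 = sum(yi for _, yi in g1) / n1
    xbar3 = sum(xi for xi, _ in g3) / n3
    ybar3 = sum(yi for _, yi in g3) / n3
    if xbar1 == xbar3:
        raise ValueError("group means of x coincide; the estimate is undefined")
    return (ybar1 - ybar3) / (xbar1 - xbar3)


# Error-free illustration: y = 1 + 2x recovers the slope exactly.
xs = [float(i) for i in range(12)]
ys = [1.0 + 2.0 * xi for xi in xs]
slope = grouping_estimate(xs, ys, 1 / 3, 1 / 3)
print(slope)
```

With errors in $x$, the ordering by observed $x_i$ is what may violate condition (1), which is the point taken up in the text that follows.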

To determine when these conditions hold, we must examine possible 
procedures by which one assigns each observation $(x_i, y_i)$ to one of the groups. 
Clearly, if the observations were assigned to each of the groups at random, then 
$E\sum_{G_1} p_1^{-1}X_i$ would equal $E\sum_{G_3} p_2^{-1}X_i$, and so the second condition would not 
be satisfied, though the first would. On the other hand, if we knew the 
magnitude of the $X_i$'s, ranked the $(x_i, y_i)$'s by the magnitude of the corresponding 
$X_i$'s, placed the first $np_1$ in $G_1$, the last $np_2$ in $G_3$, and discarded the rest, both 
conditions would be satisfied and we could estimate $\beta$ (cf. [41] in this regard). 
But it is a rare occasion when we know the relative magnitude but not the 
actual value of each of the $X_i$'s. 

We can, however, order the $(x_i, y_i)$'s by the magnitude of the $x_i$'s quite easily. 
This method of ordering simulates the method of ordering by magnitude of the 
$X_i$'s. But using this ordering procedure and basing the grouping on it does not 
guarantee the consistency of $b$, since this method of grouping may not be 
independent of the errors. What would be of interest are necessary and sufficient 
conditions for 

$$b = \frac{p_1^{-1}\sum_{G_1} y_i - p_2^{-1}\sum_{G_3} y_i}{p_1^{-1}\sum_{G_1} x_i - p_2^{-1}\sum_{G_3} x_i}$$

to be a consistent estimate of $\beta$, when the grouping is based on ordering the 
observations on the basis of the $x_i$'s and using the first $np_1$ and the last $np_2$. 
Neyman and Scott [34] have found a necessary and sufficient condition for 
the consistency of $b$ in the structural relation situation. The condition follows. 
Let $x_{p_1}$, $x_{1-p_2}$ be the $p_1$ and $(1-p_2)$ percentile points of $F(x)$, the distribution 
of $x$. If $[\mu, \nu]$ is the shortest interval such that $\Pr\{\mu\le u\le\nu\}=1$, i.e., if $\nu-\mu$ is 
the range of $u$, then $b$ is a consistent estimate of $\beta$ if and only if 

$$\Pr\{x_{p_1}-\nu < X < x_{p_1}-\mu\} = \Pr\{x_{1-p_2}-\nu < X < x_{1-p_2}-\mu\} = 0.$$

Intuitively, $b$ is a consistent estimate of $\beta$ if and only if the range of $X$ has 
"gaps" of "sufficient length" at "appropriate places" (determined by $p_1$ and $p_2$) 
where $X$ has probability zero of occurring. Only in this event can one be sure 
that as $n\to\infty$ the points which are misgrouped with respect to the $X$'s do not 
contribute to tending $\operatorname{plim}_{n\to\infty} b$ away from $\beta$. 

Practically speaking, what does this condition mean? It means first of all 
that we must know the range of the error in $x$ and above all that this range be 
finite, for if the range were infinite in both directions, the condition becomes 
$\Pr\{-\infty<X<+\infty\}=0$, which is never satisfied. In most practical situations, 
one doesn't know the distribution of $u$ and usually relies on the central limit 
theorem and assumes that $u$ is normally distributed. Since the normal 
distribution has an infinite range, one sees from the above that this method 
cannot be used when the errors are normally distributed (as we knew also from 
Reiersol's theorem). 

Besides a knowledge of the range of $u$, one must also know the $p_1$ and 
$(1-p_2)$ per cent points of the distribution of $x$ and the range in which $X$ has 
probability zero of occurring. If, for instance, $X$ is distributed continuously 
from $-\infty$ to $+\infty$, then again $b$ is not a consistent estimate of $\beta$. We see, then, 
that this method leads to consistency in very exceptional cases only. 

In the functional relation situation, we can specialize the Neyman-Scott 
condition and see that we need only know the range of $u$ to use $b$ as a 
consistent estimate of $\beta$. In that case, if the range of $u$ is finite, we can use 
non-random sampling of our $x$'s in such a way that no $x$ be observed in the intervals 
$[x_{p_1}-\nu,\ x_{p_1}-\mu]$ and $[x_{1-p_2}-\nu,\ x_{1-p_2}-\mu]$. Our grouping procedure would then 
be: (i) order the $(x_i, y_i)$'s by magnitude of $x_i$; (ii) place $(x_i, y_i)$ in $G_1$ if $x_i\le x_{p_1}$, 
in $G_2$ if $x_{p_1}<x_i\le x_{1-p_2}$, and in $G_3$ if $x_i>x_{1-p_2}$. In this case, $b$ is a consistent 
estimate of $\beta$. 

One need not know the range of $u$ precisely to "estimate" $\beta$. If one knows 
only that there exists a $\delta$ such that $\Pr\{|u|>\delta\}$ is negligible (e.g., if $u$ is 
normally distributed, then $\delta=4\sigma_u$, say), and if the number of $x_i$'s in the 
intervals $[x_{p_1}-\delta,\ x_{p_1}+\delta]$ and $[x_{1-p_2}-\delta,\ x_{1-p_2}+\delta]$ is also very small, then even 
though the grouping is strictly speaking not independent of the $u_i$'s, there is 
a high probability that the grouping by the aforementioned procedure will 
insure that the order of the $x_i$'s is the same as the order of the $X_i$'s. As we saw 
earlier, this is precisely the situation that we want, and so we may be satisfied 
with this procedure even though the estimate obtained by this method is not 
consistent. 

In contrasting the Neyman-Scott necessary and sufficient condition with 
Wald's sufficient conditions for consistency, one should notice the following. If 
Wald's condition (2) is violated, $b$ diverges wildly; if the Neyman-Scott 
condition is violated, $b$ might still converge in probability to some limit, but not to 
$\beta$. Still, for small samples, $b$ might be an "adequate" estimate of $\beta$ even though 
the Neyman-Scott condition does not hold. The consistency property of $b$ is 
not of great interest if one is dealing with a small sample, for in that case we 
cannot be sure that $b$ is "close to" $\beta$ at all. Only as $n$ becomes very large are 
we more sure that $b$ is within $\epsilon$ of $\beta$. 

Of greater interest is the relative efficiency of $b$ for different values of $p_1$ and 
$p_2$. Nair and Shrivastava [31] and Bartlett [4] have considered the case in 
which $X$ is uniformly distributed and observed without error and have shown 
that if one compares $b$, the estimate based on the method of grouping, with $\hat\beta$, 
the estimate based on least squares, one finds that $\operatorname{Var}\hat\beta/\operatorname{Var} b$, the efficiency 
of $b$, is maximum when $p_1=p_2=1/3$. The efficiency of $b$ in that case is 89%. 
In [42] Theil and van Yzeren have shown that if $X$, measured without error, 
has a Beta distribution, then the method of grouping is most efficient if 
$p_1=p_2=.3$. The efficiency depends on the parameters of the Beta distribution, is not 
appreciably affected by deviations from symmetry, and the maximal efficiency 
is of the order of .85. They also show that for the symmetric triangular 
distribution of $X$, $p_1=p_2=.28$ is most efficient and if $X$ has a normal distribution 
$p_1=p_2=.27$ is most efficient. In [15], Gibson and Jowett give the following 
rules for grouping for various distributions of $X$ and the corresponding 
efficiency. 

Description     f(x)                            Range of x               Efficiency 

Normal          $(2\pi)^{-1/2}e^{-x^2/2}$       $-\infty<x<\infty$ 
Rectangular     $.5$                            $-1<x<1$ 
Bell-shaped     $\tfrac{3}{4}(1-x^2)$           $-1<x<1$ 
U-shaped        (4+24)                          $-1<x<1$ 
J-shaped        $e^{-x-2}$                      $-2<x<\infty$ 
Skew            $x^3e^{-x/2}/96$                $0<x<\infty$ 

Nair and Banerjee [30] collected evidence from model sampling in a 
functional relation situation which showed that when both variables are subject 
to error, the grouping method in which $p_1=p_2=1/3$ gave a more efficient 
estimate of $\beta$ than the estimate based on $p_1=p_2=1/2$. Their model was one in 
which the $X$'s were one unit apart, and $u$ and $v$, the random errors added to the 
$X$'s and $Y$'s, were normally distributed with $\sigma=.1$. In other words, although 
their model did not satisfy the Neyman-Scott requirement for consistency, it 
did satisfy the approximate small sample condition given above for obtaining 
a grouping in which the rank of $x_i$ is the same as the rank of $X_i$, since $x_i<X_i+\delta$ 
$=X_i+.4$, $x_{i+1}>X_{i+1}-\delta=X_i+1-.4=X_i+.6$, approximately. This lends some 
credence to the belief that in the case in which $b$ is consistent, it is most efficient 
if based on $p_1=p_2=1/3$. However, in our example of Section 8, the estimate 
based on $p_1=p_2=1/2$ was more efficient than that based on $p_1=p_2=1/3$. 

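To obtain an approximate estimate of the standard deviation of $b$, the 
following procedure may be used. Let 

$$\bar x_1 = \frac{\sum_{G_1} x_j}{np_1},\qquad \bar x_2 = \frac{\sum_{G_2} x_j}{n(1-p_1-p_2)},\qquad \bar x_3 = \frac{\sum_{G_3} x_j}{np_2}.$$

Define $\bar y_1$, $\bar y_2$, and $\bar y_3$ similarly. Let 

$$(n-2)S_x^2 = \sum_{G_1}(x_i-\bar x_1)^2 + \sum_{G_2}(x_i-\bar x_2)^2 + \sum_{G_3}(x_i-\bar x_3)^2 + \frac{\delta(\bar x_1+\bar x_3-2\bar x_2)^2}{\left(p_1^{-1}+p_2^{-1}+4(1-p_1-p_2)^{-1}\right)/n},$$

$$(n-2)S_y^2 = \sum_{G_1}(y_i-\bar y_1)^2 + \sum_{G_2}(y_i-\bar y_2)^2 + \sum_{G_3}(y_i-\bar y_3)^2 + \frac{\delta(\bar y_1+\bar y_3-2\bar y_2)^2}{\left(p_1^{-1}+p_2^{-1}+4(1-p_1-p_2)^{-1}\right)/n},$$

and 

$$(n-2)S_{xy} = \sum_{G_1}(x_i-\bar x_1)(y_i-\bar y_1) + \sum_{G_2}(x_i-\bar x_2)(y_i-\bar y_2) + \sum_{G_3}(x_i-\bar x_3)(y_i-\bar y_3) + \frac{\delta(\bar x_1+\bar x_3-2\bar x_2)(\bar y_1+\bar y_3-2\bar y_2)}{\left(p_1^{-1}+p_2^{-1}+4(1-p_1-p_2)^{-1}\right)/n},$$
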
where 

$$\delta = \begin{cases} 0 & \text{if } p_1+p_2=1 \\ 1 & \text{otherwise.} \end{cases}$$

Let $t_0^2=t_{.95}^2(n-2)$, the square of the $t(n-2)$ distributed random variable 
exceeded in absolute value with probability .05, and $c=n(p_1^{-1}+p_2^{-1})t_0^2$. Define 

$$b^* = \frac{[n^2b_1b_2 - cS_{xy}] + \{c^2(S_{xy}^2 - S_x^2S_y^2) + cn^2(b_1^2S_y^2 + b_2^2S_x^2 - 2b_1b_2S_{xy})\}^{1/2}}{n^2b_1^2 - cS_x^2}.$$

Then, since $b^*$ is an upper 95% confidence limit on $\beta$ (cf. [4]²), $(b^*-b)/t_0$ 
gives a rough estimate of the standard deviation of $b$. 

A variant of the method of grouping which gets by with a little less 
information is the following. Fix two numbers $r$ and $s$ such that $r<s$ and $\Pr\{x<r\}>0$, 
$\Pr\{x>s\}>0$. Divide the observations into groups corresponding to whether 
$x_i\le r$, $r<x_i\le s$, or $x_i>s$, and let the number of observations in $G_i$ be $n_i$. For 
this procedure, Neyman and Scott have shown that 

$$b = \frac{n_1^{-1}\sum_{G_1} y_i - n_3^{-1}\sum_{G_3} y_i}{n_1^{-1}\sum_{G_1} x_i - n_3^{-1}\sum_{G_3} x_i}$$

is a consistent estimate of $\beta$ if and only if 

$$\Pr\{r-\nu<X<r-\mu\} = \Pr\{s-\nu<X<s-\mu\} = 0,$$

where $\nu-\mu$ is the range of $u$. In this case, we do not need any information on 
specific points of the distribution of $x$, as we did before. However, the more 
restrictive conditions remain with us, and we see that this variant of the 
grouping procedure leads to consistent estimates in the same exceptional cases as in 
the previous grouping method. 
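
The confidence limit $b^*$ defined earlier in this section is the larger root of the quadratic $n^2(b_2-\beta b_1)^2 = c\,(S_y^2 - 2\beta S_{xy} + \beta^2 S_x^2)$ in $\beta$, and that reading can be checked numerically. The sketch below assumes $b_1$, $b_2$, $S_x^2$, $S_y^2$, and $S_{xy}$ have already been computed; the function name `upper_limit_b_star` and the numerical inputs are hypothetical illustrations, not data from the paper.

```python
import math


def upper_limit_b_star(b1, b2, s_x2, s_y2, s_xy, n, p1, p2, t0):
    """Larger root of
        (n^2 b1^2 - c Sx^2) B^2 - 2 (n^2 b1 b2 - c Sxy) B + (n^2 b2^2 - c Sy^2) = 0,
    with c = n (1/p1 + 1/p2) t0^2, i.e. the b* of the text."""
    c = n * (1.0 / p1 + 1.0 / p2) * t0 ** 2
    lead = n ** 2 * b1 ** 2 - c * s_x2
    mid = n ** 2 * b1 * b2 - c * s_xy
    disc = (c ** 2 * (s_xy ** 2 - s_x2 * s_y2)
            + c * n ** 2 * (b1 ** 2 * s_y2 + b2 ** 2 * s_x2 - 2 * b1 * b2 * s_xy))
    if lead <= 0 or disc < 0:
        raise ValueError("no finite upper limit for these inputs")
    return (mid + math.sqrt(disc)) / lead


# Hypothetical inputs chosen only to exercise the formula.
n, p1, p2, t0 = 30, 1 / 3, 1 / 3, 2.048
b1, b2 = 1.0, 2.0
s_x2, s_y2, s_xy = 0.04, 0.05, 0.01
b_star = upper_limit_b_star(b1, b2, s_x2, s_y2, s_xy, n, p1, p2, t0)
rough_sd = (b_star - b2 / b1) / t0   # the rough standard deviation of b
print(b_star, rough_sd)
```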

By far the most important, useful, and valid application of the method of 
grouping arises when one has some extra-data grouping criterion which is 
correlated with $X$ but not with $u$. For example, one might be interested in the 
linear relationship between yield strength and hardness of steel, where each of 
these variables is subject to errors of observation. If one knew that he had two 
groups of pieces of steel, each group forged at a different heat, then this bit of 
information would be a good grouping criterion. This is quite different from 
the aforementioned methods of grouping, in that here we need not worry about 
such problems as the relation of the order of the $x_i$'s to the order of the $X_i$'s. 
As long as this grouping criterion is correlated with $X$ but not with $u$, Wald's 
conditions (1) and (2) are met and $b$ is a consistent estimate of $\beta$. 

This review of the idea of grouping is by no means complete. What I have 
so far discussed in detail is the outgrowth of a particular estimate due to Wald. 
Other estimates based on the idea of grouping will be discussed later, after we 
discuss the use of components of variance in regression analysis. 


3. USE OF INSTRUMENTAL VARIABLES 


Another method of obtaining consistent estimates of $\beta$ which has received 
extensive consideration is a method based on the use of what is sometimes 
called "instrumental sets of variables." These instrumental sets of variables 
are merely an additional set of variables closely related to $X$ and $Y$, the 
"investigational set" of variables. For example, if we were interested in the linear 
relation between the quantity of butter available and the price of butter, 
possible instrumental sets of some interest would be the quantity of margarine 
available, the price of margarine, or both. Use of this instrumental set of 
variables is just another example of the different kinds of additional 
information useful in estimating $\beta$. 

² Bartlett considers only the case where $p_1=p_2$. His $k=np_1=np_2$ in our notation. The first equation [4, p. 210] 
contains an error, in that the term $4/(n-k)$ should read $4/(n-2k)$. 

The major difficulty in using this approach is to find such a variable which is 
independent of the $u$'s and correlated with the $X$'s. If we are not really 
interested in observing the instrumental variable for its own sake, we are confronted 
with the problem of the additional cost of obtaining information on the 
instrumental set. If the instrumental set is highly correlated with the investigational 
set, we are faced with the question, "Shouldn't the instrumental set also be 
included in the relation of interest?" For example, in the price-quantity of butter 
example, perhaps our relation should be $P_B=\alpha+\beta Q_B+\gamma P_M$ where $P_B$, $Q_B$, and 
$P_M$ are the price of butter, quantity of butter, and price of margarine, 
respectively. 

We shall discuss two types of instrumental variables useful in facilitating the 
estimation of 8. The first set of instrumental variables is a set of observations 
on two linearly related variables where the parameters of the linear relation 
are known, and where the variables are observed with error. The second set of 
instrumental variables is a set of error-free observations on one random vari- 
able. In this latter case, we shall give one estimate of 8 in this section and defer 
another estimate of 8 to Section 4, as it can best be understood in the context of 
that section. 


3.1. Two Linearly Related Instrumental Variables Observed with Error 


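Reiersol [36] considers the case in which one knows the constants $\gamma_1$ and $\gamma_2$ 
of the relation $\gamma_1Z_1+\gamma_2Z_2=0$ between two sets of instrumental variables, $Z_1$ 
and $Z_2$. For convenience, let us change notation for the moment. Let $X=X_1$, 
$Y=X_2$, and rewrite the structural relation $Y=\beta X$ as $\beta_1X_1+\beta_2X_2=0$ where 
$\beta=-\beta_1/\beta_2$ and $EX_1=EX_2=0$.³ Also, let $u=u_1$, $v=u_2$, $x=x_1$, and $y=x_2$. 
Consider the situation in which we observe the instrumental set with error, i.e., 
$z_j=Z_j+w_j$, $j=1, 2$. Our observations are the quadruples $(x_{1i}, x_{2i}, z_{1i}, z_{2i})$, 
$i=1,\cdots,n$. Assume, as usual, that the errors in all our variables are 
uncorrelated with the true values. Define $\lambda_j=E(u_jw_j)=E(u_jz_j)=E(x_jw_j)$, $\bar\mu_{ij}$ 
$=E(X_iz_j)$, and $\mu_{ij}=E(x_iz_j)$. Then $\beta_1X_1+\beta_2X_2=0$ implies that $\beta_1E(z_jX_1)$ 
$+\beta_2E(z_jX_2)=0$, or $\beta_1\bar\mu_{1j}+\beta_2\bar\mu_{2j}=0$. If we define $B=(\beta_1, \beta_2)$ and 

$$\bar\mu = \begin{pmatrix} \bar\mu_{11} & \bar\mu_{12} \\ \bar\mu_{21} & \bar\mu_{22} \end{pmatrix},$$

then $B\bar\mu=0$. But $\bar\mu=\mu-\Lambda$ where 

$$\mu = \begin{pmatrix} \mu_{11} & \mu_{12} \\ \mu_{21} & \mu_{22} \end{pmatrix}$$

and 

$$\Lambda = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix},$$

so that $B(\mu-\Lambda)=0$. It is also evident that $\bar\mu_{i1}\gamma_1+\bar\mu_{i2}\gamma_2=0$. If we define $\Gamma$ 
$=(\gamma_1, \gamma_2)$, then $\bar\mu\Gamma'=(\mu-\Lambda)\Gamma'=0$. But 

$$(\mu-\Lambda)\Gamma' = \begin{pmatrix} \gamma_1\mu_{11}+\gamma_2\mu_{12}-\gamma_1\lambda_1 \\ \gamma_1\mu_{21}+\gamma_2\mu_{22}-\gamma_2\lambda_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

implies that $\lambda_i=(\mu\Gamma')_i/\gamma_i$, where $(\mu\Gamma')_i$ is the $i$-th element of the vector $\mu\Gamma'$. 
One sees that 

$$\mu-\Lambda = \begin{pmatrix} -(\mu_{12}\gamma_2)/\gamma_1 & \mu_{12} \\ \mu_{21} & -(\mu_{21}\gamma_1)/\gamma_2 \end{pmatrix},$$

and hence $B(\mu-\Lambda)=0$ implies that $\beta_2\mu_{21}=(\beta_1\mu_{12}\gamma_2)/\gamma_1$ and so $\beta=-\beta_1/\beta_2$ 
$=-(\gamma_1\mu_{21})/(\gamma_2\mu_{12})$. Since we know $\gamma_1$ and $\gamma_2$, and since 

$$m_{12} = \Big(\sum_{i=1}^{n} x_{1i}z_{2i}\Big)\Big/n \quad\text{and}\quad m_{21} = \Big(\sum_{i=1}^{n} x_{2i}z_{1i}\Big)\Big/n$$

are consistent estimates of $\mu_{12}$ and $\mu_{21}$, respectively, $\beta$ is estimated by 
$-(\gamma_1m_{21})/(\gamma_2m_{12})$, provided that $\operatorname{Cov}(x_1, z_2)\ne 0$. 

³ If we define $X_i'=X_i-\bar X$ and $Y_i'=Y_i-\bar Y$, we see that $Y_i'=\beta X_i'$, or $\beta_1X_i'+\beta_2Y_i'=0$ for all $i$, where $\beta=-\beta_1/\beta_2$ 
and $EX_i'=EY_i'=0$. Since we know how to estimate $\alpha$ once $\beta$ is estimated, we can, without loss of generality, 
consider the above homogeneous linear relation. 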
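
The estimate $-(\gamma_1m_{21})/(\gamma_2m_{12})$ derived above is simple to compute. The following is a minimal sketch assuming all variables have mean zero, as in the homogeneous relation of the text; the function name `reiersol_estimate` and the small error-free data set are hypothetical, chosen so that the answer can be checked by hand.

```python
def reiersol_estimate(x1, x2, z1, z2, gamma1, gamma2):
    """Estimate beta in x2 = beta * x1 (true values) from two instrumental
    variables whose true values satisfy gamma1*Z1 + gamma2*Z2 = 0."""
    n = len(x1)
    m21 = sum(a * b for a, b in zip(x2, z1)) / n   # estimates E(x2 z1)
    m12 = sum(a * b for a, b in zip(x1, z2)) / n   # estimates E(x1 z2)
    if m12 == 0:
        raise ValueError("m12 is zero; Cov(x1, z2) != 0 is required")
    return -(gamma1 * m21) / (gamma2 * m12)


# Error-free illustration with beta = 3, gamma1 = 1, gamma2 = 2.
beta_true, g1, g2 = 3.0, 1.0, 2.0
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [beta_true * v for v in x1]
z1 = [-1.0, -0.5, 0.0, 0.5, 1.0]
z2 = [-g1 * v / g2 for v in z1]                     # enforces g1*Z1 + g2*Z2 = 0
beta_hat = reiersol_estimate(x1, x2, z1, z2, g1, g2)
print(beta_hat)
```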


3.2. One Instrumental Variable Observed without Error 


The great difficulty with the above method is that we not only need a pair 
of instrumental variables, but also a knowledge of the linear relation between 
them. Let us consider the case in which we have observations on only one 
instrumental variable, say $Z$, and where, in contrast to the Reiersol situation, 
$Z$ is observed without error. Consider once again the relation $\beta_1X_i+\beta_2Y_i=0$, 
where $\beta=-\beta_1/\beta_2$, multiply it by $Z_i/n$ and sum over all $i$, to obtain the 
expression 

$$\Big(\beta_1\sum_{i=1}^{n} Z_iX_i + \beta_2\sum_{i=1}^{n} Z_iY_i\Big)\Big/n = 0.$$

Call the left hand side $\beta_1\eta_X+\beta_2\eta_Y$. Consider the same expression with $y$ and $x$ 
substituted for $Y$ and $X$, and call the left hand side $\beta_1\eta_x+\beta_2\eta_y$. Then 

$$\eta_x = \Big(\sum_{i=1}^{n} Z_ix_i\Big)\Big/n \quad\text{and}\quad \eta_y = \Big(\sum_{i=1}^{n} Z_iy_i\Big)\Big/n.$$

$E(\eta_x-\eta_X)=E(\eta_y-\eta_Y)=0$ and $\operatorname{Var}(\eta_x-\eta_X)=O(1/n)$, $\operatorname{Var}(\eta_y-\eta_Y)=O(1/n)$. 
Hence $\eta_y-\eta_Y$ and $\eta_x-\eta_X$ converge to zero in probability so that $\beta_1\eta_x+\beta_2\eta_y=0$ 
is a consistent estimate of the true relation, and 

$$b = \frac{\sum_{i=1}^{n} Z_iy_i}{\sum_{i=1}^{n} Z_ix_i}$$

188 AMERICAN STATISTICAL ASSOCIATION JOURNAL, MARCH 1959 
is an estimate of $\beta=-\beta_1/\beta_2$, provided that 

$$\Big(\sum_{i=1}^{n} Z_ix_i\Big)\Big/n$$

doesn't approach zero as $n\to\infty$, i.e., $\operatorname{Cov}(Z, x)\ne 0$. 
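Besides obtaining this estimate of $\beta$, Geary [14] also derives the exact 
distribution of a function of $b$, when $X$, $Y$, and $Z$ are normally distributed. He 
finds that 

$$\phi(b)\,db = \frac{\Gamma(n/2)}{\sqrt{\pi}\,\Gamma((n-1)/2)}\,(1+y^2)^{-n/2}\,dy,$$

with 

$$y = \left\{\frac{\mu_{33}(\mu_{22}b^2 - 2\mu_{12}b + \mu_{11})}{(\mu_{23}b-\mu_{13})^2} - 1\right\}^{-1/2},$$

where $EX=EY=EZ=0$ and $\mu_{11}=EY^2$, $\mu_{22}=EX^2$, $\mu_{33}=EZ^2$, $\mu_{12}=EXY$, $\mu_{13}$ 
$=EYZ$, and $\mu_{23}=EXZ$. We see, then, that $y$ is distributed as $(n-1)^{-1/2}t(n-1)$, 
where $t(n-1)$ is a random variable with a $t$-distribution with $n-1$ degrees of 
freedom. 

We can obtain the approximate variance of $b$ as follows. Write 

$$W^2 = y^2 \approx \left[\frac{\mu_{33}(\mu_{22}\beta^2 - 2\mu_{12}\beta + \mu_{11})}{(\mu_{23}\beta-\mu_{13})^2} - 1\right]^{-1}$$

$$\qquad + \left\{\frac{2\mu_{23}(\mu_{23}\beta-\mu_{13})}{\mu_{33}(\mu_{22}\beta^2 - 2\mu_{12}\beta + \mu_{11}) - (\mu_{23}\beta-\mu_{13})^2} - \frac{2(\mu_{23}\beta-\mu_{13})^2\left[(\mu_{22}\mu_{33}-\mu_{23}^2)\beta - (\mu_{12}\mu_{33}-\mu_{13}\mu_{23})\right]}{\left[\mu_{33}(\mu_{22}\beta^2 - 2\mu_{12}\beta + \mu_{11}) - (\mu_{23}\beta-\mu_{13})^2\right]^2}\right\}(b-\beta)$$

$$\qquad = A + B(b-\beta),$$

say, so that $\operatorname{Var} b\approx\operatorname{Var} W^2/B^2 = 2(n-2)/(n-3)^2(n-5)B^2$, $n>5$, since $W^2$ is 
distributed as $(n-1)^{-1}F(1, n-1)$. 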

It is worth noting that if $Z_i$ can take values 1, $-1$, and 0 depending on $i$ but 
independent of the $u$'s, then this estimate results in the grouping estimate. This 
is, of course, a quantification of the application of the method of grouping 
when the extra-data grouping criterion is correlated with $X$ but not with $u$. 
Durbin [10] suggests that if the order of the $x$'s is the same as the order of the 
$X$'s, then a better instrumental variable would be $Z_i=i$ where the $x_i$'s are 
ordered by magnitude. This variable will lead to a more efficient estimate than 
that of the method of grouping. Finally, the procedure of Section 6 may be 
construed as an example of estimation by means of instrumental variables, where 
the instrumental variable is just some power of $x$. 

The same estimate has been obtained in an unpublished paper of Tukey by 
the following consideration. $\operatorname{Cov}(Z, y-bx)=\operatorname{Cov}(Z, \alpha+\beta X+v-bX-bu)$ 
$=(\beta-b)\operatorname{Cov}(Z, X)$ is zero if and only if $b=\beta$. Thus, we can estimate $\beta$ by the 
value of $b$ for which the slope of the regression of $Z$ on $y-bx$ is zero. But if one 
considers the sum of squares of deviations due to regression of $Z$ on $y-bx$, one 
sees that 

$$b = \frac{\sum_{i=1}^{n} Z_iy_i}{\sum_{i=1}^{n} Z_ix_i}$$

makes this quantity zero. The idea is the same as that of Geary, who also 
formally considers the sample covariance of $Z$ with $y-bx$ to obtain his estimate. 
However, Tukey's method of motivating this estimate by considering the 
analysis of variance in regression technique is different. In our next section, we 
shall see what more can be obtained from considering linear relations with 
errors in both variables in the light of the analysis of variance. 
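
The remark in this section that an instrument taking the values 1, $-1$, and 0 reproduces the grouping estimate, and Durbin's suggestion of using ranks, can both be illustrated with a short sketch. The helper names (`group_codes`, `centred_rank_codes`) are hypothetical, and centring the ranks at zero is an assumption made here so that an intercept in the relation cancels; the paper's homogeneous relation takes all means to be zero.

```python
def iv_estimate(z, x, y):
    """b = sum(Z_i y_i) / sum(Z_i x_i)."""
    return sum(zi * yi for zi, yi in zip(z, y)) / sum(zi * xi for zi, xi in zip(z, x))


def group_codes(x, k):
    """Z_i = -1 for the k smallest x, +1 for the k largest, 0 otherwise."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    z = [0] * len(x)
    for i in order[:k]:
        z[i] = -1
    for i in order[len(x) - k:]:
        z[i] = 1
    return z


def centred_rank_codes(x):
    """Durbin-type instrument: rank of x_i, centred at zero (an assumption)."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    z = [0.0] * n
    for rank, i in enumerate(order, start=1):
        z[i] = rank - (n + 1) / 2
    return z


x = [3.0, 1.0, 4.0, 1.5, 5.0, 9.0]
y = [2.0 + 0.5 * xi for xi in x]             # error-free line, slope 0.5
z_group = group_codes(x, 2)
b_group = iv_estimate(z_group, x, y)         # equals the grouping estimate
b_rank = iv_estimate(centred_rank_codes(x), x, y)
print(z_group, b_group, b_rank)
```

With equal group sizes the coded instrument gives $(\sum_{G_3}y_i-\sum_{G_1}y_i)/(\sum_{G_3}x_i-\sum_{G_1}x_i)$, which is the grouping estimate with $p_1=p_2$.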


4. USE OF VARIANCE COMPONENTS 
4.1. Replication of Observations 


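Let us now consider the situation in which we have another kind of 
additional information, namely, where we know that we have $N_i$ observations $x_{ij}$ 
on each of $n$ $X_i$'s. If we have the situation in which $y_{ij}=Y_i+v_{ij}$ and $x_{ij}=X_i+u_{ij}$, 
and if the usual assumptions of independence are made, then one can perform 
a one-criterion analysis of variance on the $x$'s and the $y$'s, and from this obtain 
an estimate of $\beta$. The simplest way of describing the procedure is to exhibit the 
anova table. 

$$\begin{array}{llll}
\text{Source} & & \text{Mean Square} & \text{Expected Mean Square} \\[4pt]
\text{Between Sets} & \text{I} & \sum_{i=1}^{n} N_i(\bar x_{i\cdot}-\bar x_{\cdot\cdot})^2/(n-1) & \sigma_u^2 + \left[\left(N^2-\sum_{i=1}^{n} N_i^2\right)\big/(nN-N)\right]\sigma_X^2 \\[4pt]
 & \text{II} & \sum_{i=1}^{n} N_i(\bar x_{i\cdot}-\bar x_{\cdot\cdot})(\bar y_{i\cdot}-\bar y_{\cdot\cdot})/(n-1) & \operatorname{Cov}(u, v) + \left[\left(N^2-\sum_{i=1}^{n} N_i^2\right)\big/(nN-N)\right]\beta\sigma_X^2 \\[4pt]
 & \text{III} & \sum_{i=1}^{n} N_i(\bar y_{i\cdot}-\bar y_{\cdot\cdot})^2/(n-1) & \sigma_v^2 + \left[\left(N^2-\sum_{i=1}^{n} N_i^2\right)\big/(nN-N)\right]\beta^2\sigma_X^2 \\[4pt]
\text{Within Sets} & \text{IV} & \sum_{i=1}^{n}\sum_{j=1}^{N_i} (x_{ij}-\bar x_{i\cdot})^2/(N-n) & \sigma_u^2 \\[4pt]
 & \text{V} & \sum_{i=1}^{n}\sum_{j=1}^{N_i} (x_{ij}-\bar x_{i\cdot})(y_{ij}-\bar y_{i\cdot})/(N-n) & \operatorname{Cov}(u, v) \\[4pt]
 & \text{VI} & \sum_{i=1}^{n}\sum_{j=1}^{N_i} (y_{ij}-\bar y_{i\cdot})^2/(N-n) & \sigma_v^2
\end{array}$$

Here 

$$\bar x_{i\cdot} = \sum_{j=1}^{N_i} x_{ij}/N_i,\quad \bar y_{i\cdot} = \sum_{j=1}^{N_i} y_{ij}/N_i,\quad \bar x_{\cdot\cdot} = \sum_{i=1}^{n}\sum_{j=1}^{N_i} x_{ij}/N,\quad \bar y_{\cdot\cdot} = \sum_{i=1}^{n}\sum_{j=1}^{N_i} y_{ij}/N,\quad\text{and}\quad N = \sum_{i=1}^{n} N_i.$$

One sees, then, that $(\mathrm{II}-\mathrm{V})/(\mathrm{I}-\mathrm{IV})$, $(\mathrm{III}-\mathrm{VI})/(\mathrm{II}-\mathrm{V})$, and 

$$\left[\frac{\mathrm{III}-\mathrm{VI}}{\mathrm{I}-\mathrm{IV}}\right]^{1/2}$$

converge in probability to $\beta$ as $n\to\infty$ and $N_i\to\infty$ for some $i$. These estimates, 
along with similar estimates in other situations (to be described later), are due 
to Tukey [45]. (One should note that though we write $\sigma_X^2$ in the anova table, 
we do not mean to imply that this procedure is only applicable to structural 
relations. For functional relations, read 

$$\left[\left(N^2-\sum_{i=1}^{n} N_i^2\right)\Big/(nN-N)\right]\sigma_X^2 \quad\text{as}\quad \sum_{i=1}^{n} N_i(X_i-\bar X)^2\Big/(n-1),$$

where 

$$\bar X = \sum_{i=1}^{n} N_iX_i/N.)$$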
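
The three estimates obtained from the replication anova table can be computed directly from the six mean squares. The sketch below is an illustration only: the function name `variance_component_estimates` is hypothetical, the variables `m1` to `m6` stand for the mean squares I to VI, and giving the square-root estimate the sign of the first estimate is an assumption made here, since the text writes only the square root.

```python
import math


def variance_component_estimates(x_sets, y_sets):
    """Return (b1, b2, b3) from replicated observations.

    x_sets[i] and y_sets[i] hold the N_i observations made on the i-th X."""
    n = len(x_sets)
    sizes = [len(s) for s in x_sets]
    N = sum(sizes)
    x_means = [sum(s) / len(s) for s in x_sets]
    y_means = [sum(s) / len(s) for s in y_sets]
    x_bar = sum(sum(s) for s in x_sets) / N
    y_bar = sum(sum(s) for s in y_sets) / N

    # Between-sets mean squares I, II, III.
    m1 = sum(k * (xm - x_bar) ** 2 for k, xm in zip(sizes, x_means)) / (n - 1)
    m2 = sum(k * (xm - x_bar) * (ym - y_bar)
             for k, xm, ym in zip(sizes, x_means, y_means)) / (n - 1)
    m3 = sum(k * (ym - y_bar) ** 2 for k, ym in zip(sizes, y_means)) / (n - 1)

    # Within-sets mean squares IV, V, VI.
    m4 = sum((x - xm) ** 2 for s, xm in zip(x_sets, x_means) for x in s) / (N - n)
    m5 = sum((x - xm) * (y - ym)
             for xs, ys, xm, ym in zip(x_sets, y_sets, x_means, y_means)
             for x, y in zip(xs, ys)) / (N - n)
    m6 = sum((y - ym) ** 2 for s, ym in zip(y_sets, y_means) for y in s) / (N - n)

    b1 = (m2 - m5) / (m1 - m4)
    b2 = (m3 - m6) / (m2 - m5)
    ratio = (m3 - m6) / (m1 - m4)
    b3 = math.copysign(math.sqrt(ratio), b1) if ratio >= 0 else float("nan")
    return b1, b2, b3


# Error-free replicated data on the line y = 1 + 3x.
x_sets = [[1.0, 1.0], [2.0, 2.0, 2.0], [4.0, 4.0]]
y_sets = [[1.0 + 3.0 * v for v in s] for s in x_sets]
b1, b2, b3 = variance_component_estimates(x_sets, y_sets)
print(b1, b2, b3)
```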
Another estimate in this situation is one given by Housner and Brennan 
[18]. They argue as follows. Consider the expression $b_{ijkl}=(y_{ij}-y_{kl})/(x_{ij}-x_{kl})$, 
$i\ne k$. Since $y_{ij}=\alpha+\beta x_{ij}+v_{ij}-\beta u_{ij}$, $(x_{ij}-x_{kl})b_{ijkl}=\beta(x_{ij}-x_{kl})+(v_{ij}-v_{kl})$ 
$-\beta(u_{ij}-u_{kl})$, so that 

$$\frac{y_{ij}-y_{kl}}{x_{ij}-x_{kl}} = \beta + \frac{(v_{ij}-v_{kl}) - \beta(u_{ij}-u_{kl})}{x_{ij}-x_{kl}}$$

for all $i, j, k, l$, $i\ne k$. Summing over all combinations of points (where $i\ne k$) 
and ignoring the term involving the errors (since it converges to zero in 
probability), they obtain the estimate 


Zaz 


Sa re S: N;-2 DEN N;+N y) 
i=l 
This estimate approaches $\beta$ in probability as $N_i\to\infty$ for at least two distinct 
values of $i$. (Since the optimum efficiency of this estimate may depend on the 
omission of some of the combinations of points from consideration, this 
estimate has been considered as a variant of the method of grouping.) 

We have here two approaches by which to obtain consistent estimates of $\beta$ 
when we know that we have replications of observations on each $X_i$.⁴ When 
should these estimates be used? First of all, if we really believe the relation is 
linear, the optimum allocation of observations would be at two points. Hence, 
since the estimate based on variance components approaches $\beta$ in probability 
only as both $n\to\infty$ and at least one $N_i\to\infty$, the Housner-Brennan estimate, 
which approaches $\beta$ in probability as $N_i\to\infty$ for at least two distinct values of $i$ 
(i.e., whose consistency is independent of $n$), is to be preferred. If we do not 
believe that the underlying structure is linear, but are only trying to 
approximate some function in a small area of its range by a linear function, it 
may be more advisable to increase $n$ at the expense of decreasing $N_i$ to as little 
as 2. In this case, the Tukey components in regression estimate is the better. 


4.2. The Method of Grouping 


The same anova table can be looked at from the point of view of grouping: 
If our observations are divided into $r$ groups in some manner independent of 
the errors in observation of $x$, after changing $n$ to $r$, $N_i$ to $n_i$, and $N$ to $n$ in the 
above table, we can interpret the "between" mean square as a mean square 
between groups and the "within" mean square as a mean square within groups. 
We still have the same three estimates of $\beta$, namely $b_1=(\mathrm{II}-\mathrm{V})/(\mathrm{I}-\mathrm{IV})$, 
$b_2=(\mathrm{III}-\mathrm{VI})/(\mathrm{II}-\mathrm{V})$, and 

$$b_3 = \left[\frac{\mathrm{III}-\mathrm{VI}}{\mathrm{I}-\mathrm{IV}}\right]^{1/2}.$$

In particular, if $r=2$, we have in $b_1$, $b_2$, and $b_3$ competitors of any grouping 
estimate of Section 2 using all the data. However, to use these estimates one need 
only assume that the grouping is independent of the errors in $x$, and not all that 
the Neyman-Scott result asks us to assume, so that for all practical purposes, 
the only grouping estimate in competition with $b_1$, $b_2$, and $b_3$ is the one in which 
one groups the data by means of some criterion correlated with $X$ but not with 
$u$. No one has yet compared the asymptotic variances of these two types of 
estimates. Use of Tukey's estimate of the asymptotic variances of $b_1$, $b_2$, and $b_3$ 
(cf. Section 4.3) in the example of Section 8 shows that each of these estimates 
has smaller variance than the grouping estimate when $p_1=p_2=1/2$. However, 
Tukey's estimator of these asymptotic variances is very poor when $r=2$, and 
so one should not generalize too hastily from this observation. 


4.3. Use of Instrumental Variables 


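Finally, we can look at the situation in which we observe an instrumental 
variable $Z$ without error from the components in regression point of view. 
Consider the following anova table. 

$$\begin{array}{llll}
\text{Source} & & \text{Mean Square} & \text{Expected Mean Square} \\[4pt]
\text{Regression on } Z & \text{I} & \left[\sum_{i=1}^{n}(x_i-\bar x)(Z_i-\bar Z)\right]^2\Big/\sum_{i=1}^{n}(Z_i-\bar Z)^2 & \sigma_x^2 + B^2\sum_{i=1}^{n}(Z_i-\bar Z)^2 \\[4pt]
 & \text{II} & \left[\sum_{i=1}^{n}(x_i-\bar x)(Z_i-\bar Z)\right]\left[\sum_{i=1}^{n}(y_i-\bar y)(Z_i-\bar Z)\right]\Big/\sum_{i=1}^{n}(Z_i-\bar Z)^2 & \operatorname{Cov}(x, y) + \beta B^2\sum_{i=1}^{n}(Z_i-\bar Z)^2 \\[4pt]
 & \text{III} & \left[\sum_{i=1}^{n}(y_i-\bar y)(Z_i-\bar Z)\right]^2\Big/\sum_{i=1}^{n}(Z_i-\bar Z)^2 & \sigma_y^2 + \beta^2B^2\sum_{i=1}^{n}(Z_i-\bar Z)^2 \\[4pt]
\text{Balance} & \text{IV} & \left\{\sum_{i=1}^{n}(x_i-\bar x)^2 - \left[\sum_{i=1}^{n}(x_i-\bar x)(Z_i-\bar Z)\right]^2\Big/\sum_{i=1}^{n}(Z_i-\bar Z)^2\right\}\Big/(n-2) & \sigma_x^2 \\[4pt]
 & \text{V} & \left\{\sum_{i=1}^{n}(x_i-\bar x)(y_i-\bar y) - \left[\sum_{i=1}^{n}(x_i-\bar x)(Z_i-\bar Z)\right]\left[\sum_{i=1}^{n}(y_i-\bar y)(Z_i-\bar Z)\right]\Big/\sum_{i=1}^{n}(Z_i-\bar Z)^2\right\}\Big/(n-2) & \operatorname{Cov}(x, y) \\[4pt]
 & \text{VI} & \left\{\sum_{i=1}^{n}(y_i-\bar y)^2 - \left[\sum_{i=1}^{n}(y_i-\bar y)(Z_i-\bar Z)\right]^2\Big/\sum_{i=1}^{n}(Z_i-\bar Z)^2\right\}\Big/(n-2) & \sigma_y^2
\end{array}$$

Here $B$ is the slope of the regression of $x$ on $Z$. We see then that 

$$(\mathrm{I}-\mathrm{IV})\Big/\sum_{i=1}^{n}(Z_i-\bar Z)^2$$

is an estimate of $B^2$, 

$$(\mathrm{II}-\mathrm{V})\Big/\sum_{i=1}^{n}(Z_i-\bar Z)^2$$

is an estimate of $\beta B^2$, and 

$$(\mathrm{III}-\mathrm{VI})\Big/\sum_{i=1}^{n}(Z_i-\bar Z)^2$$

is an estimate of $\beta^2B^2$. Hence $(\mathrm{II}-\mathrm{V})/(\mathrm{I}-\mathrm{IV})$, $(\mathrm{III}-\mathrm{VI})/(\mathrm{II}-\mathrm{V})$, and 

$$\left[\frac{\mathrm{III}-\mathrm{VI}}{\mathrm{I}-\mathrm{IV}}\right]^{1/2}$$

are all estimates of $\beta$, provided that the denominators of these estimates do not 
approach zero as $n\to\infty$, i.e., $B\ne 0$. 

⁴ In this case, also, the solution of the maximum likelihood equation for $\beta$ has not been expressed in closed 
form. Analogous to the results of Section 1.2, we see that 

$$\frac{\left(\mathrm{III}-\dfrac{\mathrm{VI}}{\mathrm{IV}}\,\mathrm{I}\right) + \left\{\left(\mathrm{III}-\dfrac{\mathrm{VI}}{\mathrm{IV}}\,\mathrm{I}\right)^2 + 4\,\dfrac{\mathrm{VI}}{\mathrm{IV}}\,\mathrm{II}^2\right\}^{1/2}}{2\,\mathrm{II}}$$

is also a consistent estimate of $\beta$. 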

In all three situations, replications, grouping, and use of an instrumental 
variable, we have, from this approach, three different estimates of $\beta$. Let us 
examine the variance of the two estimates easiest to compute, $b_1$ and $b_2$, to 
determine when each of them should be used. I shall use the approximate 
relations $(\mathrm{II}-\mathrm{V})\approx\beta(\mathrm{I}-\mathrm{IV})$ and $(\mathrm{III}-\mathrm{VI})\approx\beta^2(\mathrm{I}-\mathrm{IV})$ in this comparison. Using 
28.4 of [7], we see that 


cuit ae ance 23 cov (I — V,I—IV 
(1) (=) ote + Bo ota — ae ov (IT — V, I — IV) 


- ie - ™ Bs Bk of 
2 ra a se a v- OV — a = 
"Nir Vv) pe? I aga EY aps | 


1 wi eoae 
( )-(2) —_ ke I-IV ~ Bk? 


2 
oiI-vI — = (Cov (II — V,I — IV — III + VI)) 


p? 1 28 . , 
~ pow soa Bk? o7ntvI — oe (1 — 6*)(Cov(II — V, I — IV)) 
B2 1 28(1 — 6) 
=~ ke ory — Bk? onI-vI — — ke? — Bo*;_1y 
B* — 26 + 268 1 
= i i = Bk? o*111-v1 
— 6? + 264 6 
~— o*1_Iv — R o11y >0 


k? 
Sp? >1 





where $k$ is either 

$$B^2\sum_{i=1}^{n}(Z_i-\bar Z)^2$$

(in the instrumental variable case), 

$$\left[\left(N^2-\sum_{i=1}^{n} N_i^2\right)\Big/(nN-N)\right]\sigma_X^2$$

(in the replication situation), or 

$$\left[\left(n^2-\sum_{i=1}^{r} n_i^2\right)\Big/(rn-n)\right]\sigma_X^2$$

(in the grouping situation). We see, then, that $b_2$ is a better estimate of $\beta$ than 
$b_1$ if we know that $|\beta|>1$, and that $b_1$ is better if $|\beta|<1$. An idea as to whether 
or not $|\beta|>1$ can be obtained from plotting the observations. 

In [45] Tukey gives the approximate variance of $b_1$ and $b_2$. Since $b_3$ is the 
geometric mean of $b_1$ and $b_2$, the approximate variance of $b_3$ can be determined 
by 28.4 of [7]. We shall give the approximate variance of $b_1$; the changes which 
must be made to obtain $\operatorname{Var} b_2$ are obvious. 

$$\operatorname{Var} b_1 \approx \frac{\dfrac{\mathrm{I}(\mathrm{III}-2\,\mathrm{II}\,b_1+\mathrm{I}\,b_1^2)}{df(\mathrm{I})} + \dfrac{\mathrm{IV}(\mathrm{VI}-2\,\mathrm{V}\,b_1+\mathrm{IV}\,b_1^2)}{df(\mathrm{IV})}}{(\mathrm{I}-\mathrm{IV})^2}$$

where 

$$df(\mathrm{I}) = \begin{cases} n-1 & \text{in case of replications} \\ r-1 & \text{in case of grouping} \\ 1 & \text{in case of instrumental variables} \end{cases}$$

$$df(\mathrm{IV}) = \begin{cases} N-n & \text{in case of replications} \\ n-r & \text{in case of grouping} \\ n-2 & \text{in case of instrumental variables.} \end{cases}$$

This approximation is good for large $df(\mathrm{I})$. Hence it will not be too good in the 
usual cases of grouping, where $r$ is very small, or when one has an instrumental 
variable. For this reason, the estimated variances via this method are not given 
in Table 200. 
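
The approximate variance formula for $b_1$ is a direct computation once the mean squares and their degrees of freedom are in hand. The sketch below is illustrative only: the function name `approx_var_b1`, the argument names `m1` to `m6` (standing for the mean squares I to VI), and the numerical inputs are hypothetical.

```python
def approx_var_b1(m1, m2, m3, m4, m5, m6, df_between, df_within):
    """Approximate Var(b1) for b1 = (II - V) / (I - IV), following the
    formula in the text; m1..m6 are the mean squares I..VI."""
    b1 = (m2 - m5) / (m1 - m4)
    between = m1 * (m3 - 2.0 * m2 * b1 + m1 * b1 ** 2) / df_between
    within = m4 * (m6 - 2.0 * m5 * b1 + m4 * b1 ** 2) / df_within
    return (between + within) / (m1 - m4) ** 2


# Hypothetical mean squares with df(I) = 3 and df(IV) = 6.
var_b1 = approx_var_b1(5.0, 4.0, 6.0, 1.0, 0.0, 2.0, df_between=3, df_within=6)
print(var_b1)
```

For these inputs $b_1=1$, the between term is $5\cdot 3/3=5$, the within term is $1\cdot 3/6=0.5$, and the result is $5.5/16$.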


5. THE BERKSON MODEL 


So far, we have considered the case in which either (x, y) was a random pair 
or the special case where (z, y) was chosen so that the grouping method was 
(at least approximately) applicable. There is, however, another model, due to 
Berkson, [5], (ef. also [22] and [28]), wherein instead of trying to observe a 
given X; but actually observing z;= X;+4;, we fix our z,’s and observe y; for 
each fixed x;. This process of fixing one’s z,’s can be done quite easily in the 
laboratory sciences, where, for example, if one wished to estimate an Ohm’s 
law constant (e.g., a resistance), one could fix the x;’s by setting the dial of the 
ammeter (presumably the cause of the errors in observing X ;) at predetermined 
settings. Then, for each fixed z (e.g., for each fixed current reading), there are a 
number of X’s which could have given rise to the particular x which is ob- 
served; also for each X there is a probability that the observed fixed x is an 
observation on that X with error u. The X’s are now random variables dis- 
tributed about the fixed x with error u, i.e., X =x+u where u is independent of 
x (and not of X) 

We now observe $y = \alpha + \beta x + \beta u + v$, and here both $\beta u$ and $v$ are independent
of $x$. Hence we have the situation in which our relation is $y = \alpha + \beta x + w$ where
$w = \beta u + v$ and $w$ is independent of $x$. If we assume $Eu = Ev = 0$, then $Ey = \alpha + \beta x$.
Here $x$ is a fixed number rather than a random variable, and we know that for
this situation the least squares estimate of $\beta$ is

$$b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}.$$



This technique works not only in the structure-function situation, but also in 
the regression situation, for the distinction in errors in y makes no difference 
in this case. 
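
The contrast between the Berkson design and the classical errors-in-variables case can be seen in a small simulation. The sketch below is only illustrative: the dial settings 1 through 10, the parameter values, and the function names are all assumptions made for the example, not quantities from the paper. Under the Berkson design the least squares slope of $y$ on the fixed $x$ should land near $\beta$, whereas with classical measurement error in $x$ it should be pulled toward zero.

```python
import random


def ols_slope(xs, ys):
    """Least squares slope of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=20000, alpha=1.0, beta=2.0, sigma_u=1.0, sigma_v=1.0, seed=0):
    """Return (Berkson slope, classical slope) for hypothetical settings."""
    rng = random.Random(seed)
    settings = [float(1 + i % 10) for i in range(n)]  # assumed dial settings

    # Berkson model: x is fixed by the experimenter, the true X = x + u.
    berkson_y = []
    for x in settings:
        true_x = x + rng.gauss(0.0, sigma_u)
        berkson_y.append(alpha + beta * true_x + rng.gauss(0.0, sigma_v))

    # Classical model: X is the true value, and x = X + u is recorded.
    classical_x, classical_y = [], []
    for true_x in settings:
        classical_x.append(true_x + rng.gauss(0.0, sigma_u))
        classical_y.append(alpha + beta * true_x + rng.gauss(0.0, sigma_v))

    return ols_slope(settings, berkson_y), ols_slope(classical_x, classical_y)


b_berkson, b_classical = simulate()
print(f"Berkson design:  b = {b_berkson:.3f}")
print(f"classical error: b = {b_classical:.3f}")
```

With these assumed values the classical slope is expected to settle near $\beta \cdot 8.25/9.25$, since 8.25 is the variance of the ten settings and 1 the error variance added to it.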

The real significance of the Berkson model lies not in the estimator of $\beta$ but
rather in the implications of the model for the design of experiments intended
to yield data from which $\beta$ may be estimated. In contrast to the problems in
estimating $\beta$ described above when $(x, y)$ is a random pair, we see that if our
physicist can fix his $x$'s, the statistician has no problem in estimating the linear
relation of interest.


6. ESTIMATION VIA CUMULANTS 


Consider the homogeneous linear relation $\beta_1 X + \beta_2 Y = 0$ where $\beta = -\beta_1/\beta_2$
and $X$ and $Y$ are random variables whose expectation is zero. That is, let us
consider a structural relation where the intercept is zero. Geary [12, 13]
noticed that since $Y = \beta X$, the bivariate cumulant of $X$ and $Y$ of order $c_1$,
$c_2 + 1$, namely $\kappa(c_1, c_2 + 1)$, was equal to $\beta\kappa(c_1 + 1, c_2)$. Also, if $c_1, c_2 > 0$, $K(c_1, c_2)$,
the cumulant of the distribution of the $x$'s and $y$'s, is equal to $\kappa(c_1, c_2)$. This is
evident from the following properties of bivariate cumulants (cf. Kendall
[23], Kaplan [20], and Lindley [27]): (a) the cumulant of a sum of independent
random variables is the sum of the cumulants of the variables, and (b) the
bivariate cumulant of any order $c_1 + c_2$ (where both $c_1, c_2 > 0$) of independent
random variables is zero. Since Cov$(u, X)$ = Cov$(v, Y)$ = Cov$(u, v)$ = 0,
$K(c_1, c_2) = \kappa(c_1, c_2)$ + [$(c_1, c_2)$-th cumulant of $X$ and $u$] + [$(c_1, c_2)$-th cumulant of
$Y$ and $v$] + [$(c_1, c_2)$-th cumulant of $u$ and $v$] $= \kappa(c_1, c_2) + 0 + 0 + 0 = \kappa(c_1, c_2)$.
Hence, since $k(c_1, c_2)$, the sample k-statistic, is an unbiased consistent estimate
of $K(c_1, c_2)$, we see that

$$\hat{\beta} = \frac{k(c_1, c_2 + 1)}{k(c_1 + 1, c_2)}$$

is a consistent estimate of $\beta$ if $k(c_1 + 1, c_2)$ does not approach zero.

This estimate has not used any additional information and hence by the
identifiability result we know that in the case of normality it must fail. This is
certainly correct, for in the normal distribution, all cumulants of degree three
or higher are zero, and since $c_1 > 0$, $c_2 > 0$, and cumulants of order $c_1 + c_2 + 1$ are
used, we cannot estimate $\beta$ by this method in the normal case.
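
A minimal sketch of the lowest-order case, $c_1 = c_2 = 1$, may help fix ideas. It assumes a skewed (exponential) distribution for $X$ and normal errors; these choices, the parameter values, and the function names are illustrative assumptions rather than anything prescribed by Geary. Third-order k-statistics are a common multiple of the centred sample product moments, so that multiple cancels in the ratio $k(1, 2)/k(2, 1)$ and the centred sums can be used directly.

```python
import random


def third_order_cumulant_slope(xs, ys):
    """Ratio k(1, 2) / k(2, 1) computed from centred product sums."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    k12 = sum((x - x_bar) * (y - y_bar) ** 2 for x, y in zip(xs, ys))
    k21 = sum((x - x_bar) ** 2 * (y - y_bar) for x, y in zip(xs, ys))
    return k12 / k21


def ols_slope(xs, ys):
    """Ordinary least squares slope of ys on xs, for comparison."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


rng = random.Random(1)
beta, sigma_u, sigma_v, n = 2.0, 0.5, 0.5, 100000  # assumed values

xs, ys = [], []
for _ in range(n):
    true_x = rng.expovariate(1.0)                       # skewed, non-normal X
    xs.append(true_x + rng.gauss(0.0, sigma_u))         # x = X + u
    ys.append(beta * true_x + rng.gauss(0.0, sigma_v))  # y = beta * X + v

geary = third_order_cumulant_slope(xs, ys)
ols = ols_slope(xs, ys)
print(f"cumulant ratio: {geary:.3f}   least squares: {ols:.3f}")
```

Replacing the exponential draw by a normal one makes both third-order sums fluctuate about zero, which is the failure in the normal case described above.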

Another problem in using this method is that of what order cumulants to use,
for this method provides us, in the non-normal case, with an infinity of estimates
based on the different orders of the cumulants used. Geary suggests that
cumulants of lowest order be used because of ease of computation. Even so,
one should know something about the shape of the joint distribution of $x$ and
$y$, for all odd cumulants of a symmetric distribution are zero, and hence cannot
be used to estimate $\beta$.

A better measure of what order cumulants to be used is the variance of the
estimates based on different orders. To be on the safe side, fourth order cumulants
should be used to estimate $\beta$ if nothing is known about the joint distribution
of $x$ and $y$, since the distribution may be symmetric and then use of third
order cumulants will not yield an estimate of $\beta$. However, one should note that
inaccuracy in estimation of cumulants generally increases rapidly with order,
and, coupled with the fact that this general method of estimation breaks down
in the normal case, one is not very likely to use it to estimate $\beta$.

One method of improving estimation via cumulants is to pool estimates of $\beta$
based on different values of $c_1, c_2$. The linear combination of estimates $\hat{\beta}(c_1, c_2)$
and $\hat{\beta}(c_1^*, c_2^*)$ which has minimum variance is the combination
$a\hat{\beta}(c_1, c_2) + (1 - a)\hat{\beta}(c_1^*, c_2^*)$, where

$$a = \frac{V(\hat{\beta}(c_1^*, c_2^*)) - \operatorname{Cov}(\hat{\beta}(c_1, c_2), \hat{\beta}(c_1^*, c_2^*))}{V(\hat{\beta}(c_1, c_2) - \hat{\beta}(c_1^*, c_2^*))}.$$

To facilitate this averaging, the asymptotic variance of $\hat{\beta}(c_1, c_2)$ and the asymptotic
covariance of $\hat{\beta}(c_1, c_2)$ and $\hat{\beta}(c_1^*, c_2^*)$ should be determined. But, using the
multivariate extension of 28.4 of [7],

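
$$V\left(\frac{k(c_1, c_2 + 1)}{k(c_1 + 1, c_2)}\right) \approx \left[\frac{\kappa(c_1, c_2 + 1)}{\kappa(c_1 + 1, c_2)}\right]^2 \left[\frac{V(k(c_1, c_2 + 1))}{\kappa^2(c_1, c_2 + 1)} + \frac{V(k(c_1 + 1, c_2))}{\kappa^2(c_1 + 1, c_2)} - \frac{2\,C(k(c_1, c_2 + 1), k(c_1 + 1, c_2))}{\kappa(c_1, c_2 + 1)\,\kappa(c_1 + 1, c_2)}\right]$$

and

$$C\left(\frac{k(c_1, c_2 + 1)}{k(c_1 + 1, c_2)}, \frac{k(c_1^*, c_2^* + 1)}{k(c_1^* + 1, c_2^*)}\right) \approx \frac{\kappa(c_1, c_2 + 1)}{\kappa(c_1 + 1, c_2)} \cdot \frac{\kappa(c_1^*, c_2^* + 1)}{\kappa(c_1^* + 1, c_2^*)} \left[\frac{C(k(c_1, c_2 + 1), k(c_1^*, c_2^* + 1))}{\kappa(c_1, c_2 + 1)\,\kappa(c_1^*, c_2^* + 1)} + \frac{C(k(c_1 + 1, c_2), k(c_1^* + 1, c_2^*))}{\kappa(c_1 + 1, c_2)\,\kappa(c_1^* + 1, c_2^*)} - \frac{C(k(c_1, c_2 + 1), k(c_1^* + 1, c_2^*))}{\kappa(c_1, c_2 + 1)\,\kappa(c_1^* + 1, c_2^*)} - \frac{C(k(c_1 + 1, c_2), k(c_1^*, c_2^* + 1))}{\kappa(c_1 + 1, c_2)\,\kappa(c_1^*, c_2^* + 1)}\right],$$
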


and each of these variances and covariances can be determined using the pro- 
cedure of [20]. 
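
The pooling rule itself is a short computation once the variances and the covariance are in hand. The sketch below applies the weight $a$ to the two fourth-order estimates reported in Section 8 (3.326 and 4.538); the variances and covariance fed to it are purely hypothetical numbers chosen for illustration, and the function name is an assumption.

```python
def pooled_estimate(b1, b2, var1, var2, cov12):
    """Minimum-variance combination a*b1 + (1 - a)*b2 of two estimates."""
    var_diff = var1 + var2 - 2.0 * cov12          # V(b1 - b2)
    if var_diff <= 0.0:
        raise ValueError("V(b1 - b2) must be positive")
    a = (var2 - cov12) / var_diff
    pooled = a * b1 + (1.0 - a) * b2
    pooled_var = (a * a * var1 + (1.0 - a) ** 2 * var2
                  + 2.0 * a * (1.0 - a) * cov12)
    return a, pooled, pooled_var


# Hypothetical variances and covariance, for illustration only.
a, pooled, pooled_var = pooled_estimate(3.326, 4.538, 0.04, 0.16, 0.02)
print(f"a = {a:.3f}, pooled estimate = {pooled:.4f}, variance = {pooled_var:.4f}")
```

The pooled variance can never exceed the smaller of the two input variances, since $a = 1$ or $a = 0$ is always an admissible choice.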

Estimation of $\beta$ via moments rather than cumulants has also been considered.
Since the ideas behind and the problems besetting this method are similar to 
those presented above, I shall only refer the reader to [9] and [38] for a fuller 
discussion of these estimates. 











7. ESTIMATION IN IDENTIFIABLE CASES 


An entirely different approach to the problem of estimating $\beta$ in the linear
structural relation is taken by Neyman [33]. He first rewrites the equation
$Y = \alpha + \beta X$ in polar coordinates as $X\cos\theta^* + Y\sin\theta^* = p$ where $\theta^* = \tan^{-1}\beta$ and
$-\pi/2 < \theta^* \le \pi/2$. Neyman then finds a consistent estimate of

$$\theta = \begin{cases} 0 & \text{if } \theta^* = \pi/2 \\ \theta^* & \text{otherwise} \end{cases}$$

when the following conditions are true: (i) $X$ and $Y$ follow an arbitrary non-normal
distribution, and (ii) $u = u' + u''$ and $v = v' + v''$ where (a) $u'$, $v'$ are independent,
(b) $u'$, $v'$ are independent of $X$ and $Y$, (c) $u'$ and $v'$ are arbitrarily
distributed, (d) $(u'', v'')$ is distributed in an arbitrary bivariate normal distribution,
and (e) $(u'', v'')$ are independent of $X$, $Y$, $u'$, and $v'$. Condition (i) is due
to the identifiability result of Reiersol and the fact that Neyman uses no additional
information in arriving at his estimate.

Since a good presentation of Neyman’s estimate is given in [33] and the 
procedure is lengthy and complex, the reader is referred to [33] for a detailed 
exposition of this estimate. This estimate is motivated by and based on the 
properties of non-normal characteristic functions, i.e., characteristic functions 
of distributions not satisfying the Reiersol necessary and sufficient condition 
for identifiability. 

There are quite a few problems to be faced in applying this method to practical
situations. Aside from a lack of criteria on how to make certain choices
before commencing to use Neyman's procedure, there is the possibility that one
may end up estimating $\theta = 0$, in which case an estimate for $\theta^*$ (and hence $\beta$) is
indeterminate.

Wolfowitz [49, 50, 51, 52] employs another technique, the minimum distance
method, to estimate $\alpha$ and $\beta$ for both structural and functional relations
when the conditions for identifiability are met. The only difference between
Wolfowitz's assumptions and those of Neyman above is that $u$ and $v$ cannot
be expressed as the sum of two components, one normally and the other
arbitrarily distributed. Rather, Wolfowitz assumes that $u$ and $v$ are jointly
normally distributed with zero means and independent of $X$ and $Y$. On the
other hand, Wolfowitz estimates $\beta$ for all values of $\beta$, in contrast to Neyman's
estimate of $\beta = \tan\theta^*$ only for $\theta^* \ne 0$. The fundamental idea behind the estimate
is the following. The estimates of $\alpha$ and $\beta$, $\hat{\alpha}$ and $\hat{\beta}$, say, are chosen so that the
empiric distribution function of the observations and the true distribution
function of the random variables $\{(x_i, y_i),\ i = 1, \dots, n\}$ when $\hat{\alpha}$ and $\hat{\beta}$ replace
$\alpha$ and $\beta$ in this distribution function are “closest” in a sense defined in these
papers.

For the functional relation, intuitively speaking, if (I) the $X_i$'s do not go off
to infinity “too fast,” and (II) the “empiric distribution” of the (non-random)
$X$'s does not get “too close” to some normal distribution with zero mean, then
$\hat{\alpha}$ and $\hat{\beta}$ are strongly consistent estimates of $\alpha$ and $\beta$. In [49], Wolfowitz gives
sufficient conditions for I and II to be satisfied which are not difficult to meet,
so that conditions I and II are usually satisfied. In particular, if the $X_i$'s are
random variables with some non-normal distribution, then conditions I and
II are met with probability one and so the result for the structural relation is a
special case of the result for the functional relation.


8. AN EXAMPLE


Let us now consider a case in which an estimate of the parameter $\beta$ in the
linear relation between two variables observed with error is desired. We are
concerned with estimating the linear relationship between the Brinell hardness 
and the yield strength of artillery shells. Shells were manufactured from two 
different heats of steel, a random sample of 25 shells manufactured from each 
of the two heats of steel was taken, and the Brinell hardness and yield strength 
of the 50 shells were measured. Brinell hardness was measured by making a 
dent in each shell with a device connected to a dial from which the “hardness” 
in appropriate units is read. Besides variation in the force applied in making 
the dent, we have errors in this measurement from two other sources, in- 
homogeneity of the steel shell with respect to hardness and error in reading of 
the Brinell dial. Yield strength was measured by taking a piece of steel of 
specified length, width, and breadth, pulling it from two sides with constant 
pressure for a given period of time, and converting the new dimension into a 
measure of yield strength. Errors in this measurement are due again to in- 
homogeneity of the steel and to other errors of mensuration. The data appear 
in Table 198. 


TABLE 198


              Low Heat                               High Heat

Yield Strength    Brinell Hardness     Yield Strength    Brinell Hardness
     $y$               $x$                  $y$               $x$

285 
285 
285 
285 
285 
285 
285 
285 
285 
285 
285 
285 
285 
285 
285 
293 
293 
293 
293 


302
302
302


We have been told that $\sigma_u^2$ and $\sigma_v^2$ are approximately 50 and 7500, respectively.
Using this information, we obtain the four least squares estimates of
Section 1,


(1) b = 3.536
(2) b = 2.738
(3) b = 3.475
(4) b = 3.112.


To use some of the other estimation techniques, we must make some inter- 
pretations about our underlying model. One interpretation is that each piece 
of steel is really different and has its own true Brinell hardness and its own true 
yield strength. We can immediately apply the method of grouping to this case, 
as we have a natural extra-data grouping criterion here, the heat at which the 
various pieces of steel were tempered. (Actually, since $\sigma_u^2$ is 50 and since the
25th largest $x_i$ was 241 and the 26th was 277, about 40 apart, the approximate
grouping procedure applied here as well.) The Wald estimate based on all 50
observations, 25 in each group, was b=3.204, and, after discarding the middle 
1/3 of the observations and using 17 observations per group, b=4.256. The 
Tukey procedure yields the following mean squares: 


  I     33,385.28
 II    106,977.6
III    342,792.0


so that (II − V)/(I − IV) = 3.20, (III − VI)/(II − V) = 3.142,

$$\sqrt{\frac{\mathrm{III} - \mathrm{VI}}{\mathrm{I} - \mathrm{IV}}} = 3.172,$$

and the estimate of $\beta$ given by the estimator in the footnote of Section 4.1 is
3.427.

If we interpret the situation so that for each piece of steel manufactured 
at the same heat there is the same “true” Brinell hardness and yield strength, 
we see that we have 25 replications on each of two values of X. As we saw be- 
fore, the previous anova table for grouping can also be considered in a new 
light and interpreted for replications, and so we obtain the same estimation by 
the components in regression technique when adopting this point of view. Upon 
applying the Housner and Brennan estimate to this case, we find that our 
estimate of $\beta$ is 3.204, exactly the same estimate as Wald’s. This is so in this
case because there are only two X’s (corresponding to two groups), and the 
Housner-Brennan estimate eliminates no observations in its summation over all 
sample points. 

It is of interest to see how these estimates compare with some of the others
mentioned above. The least squares regression of $y$ on $x$ (which would be the
Berkson estimate if this model were Berksonian) yields $b = 3.288$. Berkson [5]
shows that

$$\operatorname{Var} b = \frac{\beta^2\sigma_{u'}^2 + \sigma_v^2}{n\sigma_x^2}$$

where $\sigma_{u'}^2$ is the variance of $u$ when $x$ is measured as a controlled observation.
If we assume that $\sigma_{u'}^2 = \sigma_u^2$, then an estimate of the standard deviation of $b$ is
.47. Using third order cumulants, Geary’s estimate of $\beta$ is 5.66; using fourth
order cumulants we find that $k_{22}/k_{31} = 3.326$ and $k_{13}/k_{22} = 4.538$. If we use
$Z_i = i$ as our instrumental variable, with the $x_i$'s ordered by magnitude, we find
that $b = 3.462$. Finally, in computing the Neyman estimate for a particular
choice of characteristic functions used in his procedure, the estimate of $\theta$ was 0,
and the estimate of $\beta$ was indeterminate.

To summarize, following are the estimates of the slope of the linear relation
and, where computed, estimates of the standard deviation of the estimated
slope.


TABLE 200


Method                                                     Estimate of $\beta$    Standard Deviation of Estimate

Least squares $y$ on $x$                                        3.288                  .47
Least squares knowing $\sigma_u^2$                              3.536
Least squares knowing $\sigma_v^2$                              2.738
Least squares knowing $\lambda = \sigma_v^2/\sigma_u^2$         3.475
Least squares knowing $\sigma_u^2$ and $\sigma_v^2$             3.112
Grouping, $p_1 = p_2 = 1/2$                                     3.204
Grouping, $p_1 = p_2 = 17/50$                                   4.256
Instrumental variable $Z_i = i$                                 3.462
Components of variance 1                                        3.202
Components of variance 2                                        3.142
Components of variance 3                                        3.172
Components of variance 4                                        3.427
Cumulants, 3rd order                                            5.660
Cumulants, 4th order (1)                                        3.326
Cumulants, 4th order (2)                                        4.538


In this particular case, another consistent estimate of $\beta$, namely $\bar{y}/\bar{x} = 3.434$,
should also be considered, for we can assume that when $X = 0$, $Y = 0$, so that
$\alpha = 0$. The estimated asymptotic standard deviation of $\bar{y}/\bar{x}$ is .04, and so this is
a good estimate of $\beta$. It is this peculiarity of the example which lessens its value.
Nevertheless, the differences between the various estimates presented in this 
paper are illustrated by the example considered. 


9. APPENDIX 


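Lindley gives

$$\hat{\beta} = \frac{\sum_{i=1}^{n}(y_i - \bar{y})^2 - \lambda\sum_{i=1}^{n}(x_i - \bar{x})^2}{2\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})} + \sqrt{\left[\frac{\sum_{i=1}^{n}(y_i - \bar{y})^2 - \lambda\sum_{i=1}^{n}(x_i - \bar{x})^2}{2\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}\right]^2 + \lambda} \tag{1}$$

as his estimate of $\beta$ when $\lambda = \sigma_v^2/\sigma_u^2$ is known. If we let

$$\gamma = \frac{\sum_{i=1}^{n}(y_i - \bar{y})^2 - \lambda\sum_{i=1}^{n}(x_i - \bar{x})^2}{2\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})},$$

we rewrite (1) as

$$\hat{\beta} = \gamma + \sqrt{\gamma^2 + \lambda},$$

which can never be less than zero, and hence is an incorrect estimate of $\beta$. He
arrives at this estimate by solving the following quadratic equation (where for
convenience the $x$'s and $y$'s are measured from their means):

$$\beta^2\Big(\sum_{i=1}^{n} x_i y_i\Big) + \beta\sum_{i=1}^{n}(\lambda x_i^2 - y_i^2) - \lambda\Big(\sum_{i=1}^{n} x_i y_i\Big) = 0.$$

We obtain

$$\hat{\beta} = \frac{\sum_{i=1}^{n} y_i^2 - \lambda\sum_{i=1}^{n} x_i^2 \pm \sqrt{\Big(\sum_{i=1}^{n}(\lambda x_i^2 - y_i^2)\Big)^2 + 4\lambda\Big(\sum_{i=1}^{n} x_i y_i\Big)^2}}{2\sum_{i=1}^{n} x_i y_i}. \tag{2}$$
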


By algebraic manipulation, (2) can be put into form (1), except for a possible 
minus sign before the radical. Lindley then asserts that using the positive sign 
yields a maximum likelihood estimate, whereas using the negative sign yields 
a minimum likelihood estimate. In form (2), this is true; in form (1) this is 
clearly not true, by the above argument. 

That the plus sign is correct in (2) can easily be seen by considering the equation

$$\hat{\rho}(x, y) = \frac{\sum_{i=1}^{n} x_i y_i}{\Big(\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i^2\Big)^{1/2}};$$

we see that

$$\operatorname{sgn}[\hat{\rho}(x, y)] = \operatorname{sgn}\Big(\sum_{i=1}^{n} x_i y_i\Big).$$

Since $\sigma_X^2/\sigma_x\sigma_y$ is positive, we have that

$$\operatorname{sgn}\Big(\sum_{i=1}^{n} x_i y_i\Big) = \operatorname{sgn}\beta.$$

Therefore, if form (1) is used, the rule would be to use a minus sign before the
square root if

$$\sum_{i=1}^{n} x_i y_i < 0,$$

and a plus sign otherwise. In form (2),

$$\operatorname{sgn}\Big(\sum_{i=1}^{n} x_i y_i\Big) = \operatorname{sgn}\hat{\beta}$$

if and only if the numerator is positive, which is only when the plus sign is used.
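
The sign rule can be checked numerically on data with a negative slope. The following sketch uses made-up observations and takes $\lambda = 1$ purely for illustration; the function name is hypothetical. Form (1) with the plus sign should come out positive (and so wrong in sign), while form (1) with the sign of $\sum x_i y_i$ attached to the radical should agree with form (2) using the plus sign.

```python
import math


def lindley_forms(xs, ys, lam):
    """Return (form (1) with +, form (1) with the sign rule, form (2) with +)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    a = syy - lam * sxx
    gamma = a / (2.0 * sxy)
    root = math.sqrt(gamma ** 2 + lam)

    form1_plus = gamma + root
    form1_rule = gamma + math.copysign(root, sxy)   # minus sign when sxy < 0
    form2_plus = (a + math.sqrt(a * a + 4.0 * lam * sxy * sxy)) / (2.0 * sxy)
    return form1_plus, form1_rule, form2_plus


# Hypothetical data with a clearly negative relation.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [5.1, 3.9, 3.2, 1.8, 1.0]
form1_plus, form1_rule, form2_plus = lindley_forms(xs, ys, lam=1.0)
print(form1_plus, form1_rule, form2_plus)
```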

The least squares estimate of $\beta$ when $\lambda = \sigma_v^2/\sigma_u^2$ is known has appeared independently
innumerable times with the earliest appearance in 1879 [26]⁵ and
the latest in 1946 [3]. Lindley [27], Tintner [44], and Zucker [53] refer to
many of these papers. In searching through the literature on this subject, we
find that form (1) of $\hat{\beta}$ is cited only in [18, 19, 26, 27].

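One should note, though, that some of the papers referred to in [27] and
[53] derive the least squares estimate of $\tan 2\theta$, where $\beta = \tan\theta$, when $\lambda = 1$.
They find that

$$\tan 2\hat{\theta} = \frac{2\hat{\rho}(x, y)\,\hat{\sigma}_x\hat{\sigma}_y}{\hat{\sigma}_x^2 - \hat{\sigma}_y^2}$$

but do not solve for $\beta$ using the relation

$$\tan 2\theta = \frac{2\tan\theta}{1 - \tan^2\theta} = \frac{2\beta}{1 - \beta^2}.$$

If they had done so, they would have to solve

$$\Big(\sum_{i=1}^{n} x_i y_i\Big)\beta^2 + \Big(\sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} y_i^2\Big)\beta - \sum_{i=1}^{n} x_i y_i = 0$$

for $\beta$, the same quadratic as above when $\lambda = 1$.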

Pearson [35] was one who estimated $\tan 2\theta$ but then argued that “the best-fitting
straight line for the system of points coincides in direction with the
major axis of the correlation ellipse.” But the direction of the major axis of
the correlation ellipse depends only on

$$\operatorname{sgn}\Big(\sum_{i=1}^{n} x_i y_i\Big).$$

Hence Pearson’s estimate is equivalent to form (2) when $\lambda = 1$. In none of the
other papers do I find such an argument.

⁵ 1878 [1], when $\lambda$ was assumed equal to 1.

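This estimate is also well known to econometricians (cf. [10, 25, 44] and also
[46]) in the following guise. Consider the linear relation $Y_1 + \alpha_2 Y_2 + \cdots + \alpha_m Y_m = 0$,
where each of the $Y_i$'s is observed with error, i.e., $y_{ij} = Y_{ij} + u_{ij}$ is observed,
$j = 1, \dots, n$. Let $\xi$ be the $n \times m$ matrix of $y_{ij}$'s, let $V$ be the $m \times m$ covariance
matrix of the $u_i$'s, and assume that $V^* = kV$ is known. Then if $\hat{\alpha} = (1, \hat{\alpha}_2, \dots, \hat{\alpha}_m)$
denotes the maximum likelihood estimate of $\alpha = (1, \alpha_2, \dots, \alpha_m)$, $\hat{\alpha}$ is the solution
of the equation $(\xi'\xi - \theta V^*)\hat{\alpha}' = 0$, where $\theta$ is the smallest root of $|\xi'\xi - \theta V^*| = 0$.

In our case,

$$\xi'\xi = \begin{pmatrix} \sum_{i=1}^{n} y_i^2 & \sum_{i=1}^{n} x_i y_i \\ \sum_{i=1}^{n} x_i y_i & \sum_{i=1}^{n} x_i^2 \end{pmatrix}, \qquad V^* = \begin{pmatrix} \lambda & 0 \\ 0 & 1 \end{pmatrix},$$

$$\theta = \frac{\lambda\sum_{i=1}^{n} x_i^2 + \sum_{i=1}^{n} y_i^2 - \sqrt{\Big(\lambda\sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} y_i^2\Big)^2 + 4\lambda\Big(\sum_{i=1}^{n} x_i y_i\Big)^2}}{2\lambda},$$

and our estimate of $\beta$, $-\hat{\alpha}_2$, is equation (2) above.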

One final note should be made with regard to the literature on least squares 
estimation of 8. As was pointed out earlier, weighted least squares as defined 
in 1.1 is appropriate here. Zucker, though, gives estimates based on various 
other “least squares” approaches, all of which are not consistent estimates of $\beta$.
The estimate she calls k, is equivalent to the “least squares” estimate when
$\lambda = 1$. The estimate she calls k,_, is equal to k, for functional relations; for
structural relations it is not a consistent estimate of $\beta$.


LINEAR PROGRAMMING TECHNIQUES FOR
REGRESSION ANALYSIS


HARVEY M. WAGNER*
Stanford University 


In regression problems alternative criteria of “best fit” to least squares 
are least absolute deviations and least maximum deviations. In this 
paper it is noted that linear programming techniques may be employed 
to solve the latter two problems. In particular, if the linear regression 
relation contains p parameters, minimizing the sum of the absolute 
value of the “vertical” deviations from the regression line is shown to 
reduce to a p equation linear programming model with bounded vari- 
ables; and fitting by the Chebyshev criterion is exhibited to lead to a 
standard-form p+ 1 equation linear programming model. 


1. INTRODUCTION 


Karst [7] has suggested recently an iterative procedure “for finding a
straight line of best fit to a set of two dimensional points such that the
sum of the absolute values of the vertical deviations of the points from the line
is a minimum.” Because linear programming is a relatively new tool to the
statistician, we offer here an elementary presentation of the known results that
the multi-dimensional version of Karst’s problem and the model of multiple
linear regression according to a Chebyshev criterion [6] may be solved directly
by linear programming methods. Charnes, Cooper, and Ferguson [1] and other
practitioners of linear programming have recognized that the problem of minimizing
the sum of absolute deviations can be converted into a linear programming
model consisting of $k$ equations, where $k$ is the number of observations,
and an objective function which calls for minimizing the sum of non-negative
variables.¹ First, by employing the fundamental dual theorem [2, 9, 11] in
linear programming, we shall show how the problem can be handled by a $p$
equation linear programming model with bounded variables [3, 4, 12], where
$p$ is the number of regression coefficients. Secondly, in a manner directly
analogous to that of Kelley [8],² we shall demonstrate how a regular $p + 1$
equation linear programming model can be utilized to find a line of best fit
according to a Chebyshev criterion, i.e., a line (or hyperplane) which minimizes
the maximum of the vertical deviations from the sample points.


2. DUAL LINEAR PROGRAMMING PROBLEMS 


For the purpose of subsequent reference throughout this paper, we give a
version of Goldman and Tucker’s canonical representation of “dual” linear
programming problems [5]. The primal model consists of $m$ linear relations in
$n$ unknowns $x_l$, in which the relations are partitioned into two mutually exclusive
and completely exhaustive classes $M_1$ and $M_2$, and similarly the variables
into two such classes $N_1$ and $N_2$, as specified by (2).

* This work was sponsored in part by the Office of Naval Research.

¹ In addition to exhibiting the model for the regression problem, Charnes, Cooper, and Ferguson present an
interesting numerical illustration of the technique in which a nine parameter regression problem is solved by the
utilization of linear inequalities among less than nine observations.

² The author is indebted to a referee for bringing this reference to his attention.


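$$\text{Maximize } c_1 x_1 + \cdots + c_n x_n \tag{1}$$

subject to the constraints

$$a_{h1}x_1 + \cdots + a_{hn}x_n \begin{cases} \le b_h & \text{for each } h \text{ in } M_1 \qquad \text{(2a)} \\ = b_h & \text{for each } h \text{ in } M_2 \qquad \text{(2b)} \end{cases}$$

$$x_l \begin{cases} \text{non-negative} & \text{for each } l \text{ in } N_1 \qquad \text{(2c)} \\ \text{unrestricted in sign} & \text{for each } l \text{ in } N_2. \qquad \text{(2d)} \end{cases}$$

In particular cases, one or more of the classes $M_1$, $M_2$, $N_1$, and $N_2$ may be
empty.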

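The corresponding dual model consists of $n$ linear relations in $m$ unknowns
$u_h$ with parallel partitioning:

$$\text{Minimize } u_1 b_1 + \cdots + u_m b_m \tag{3}$$

subject to the constraints

$$u_1 a_{1l} + \cdots + u_m a_{ml} \begin{cases} \ge c_l & \text{for each } l \text{ in } N_1 \qquad \text{(4a)} \\ = c_l & \text{for each } l \text{ in } N_2 \qquad \text{(4b)} \end{cases}$$

$$u_h \begin{cases} \text{non-negative} & \text{for each } h \text{ in } M_1 \qquad \text{(4c)} \\ \text{unrestricted in sign} & \text{for each } h \text{ in } M_2. \qquad \text{(4d)} \end{cases}$$
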


The fundamental dual theorem of linear programming [2, 5, 9, 11] states
that a set of $x_l^*$ satisfying (2) is optimal for (1) if and only if there is a set of $u_h^*$
satisfying (4) with

$$c_1 x_1^* + \cdots + c_n x_n^* = u_1^* b_1 + \cdots + u_m^* b_m. \tag{5}$$

Similarly a set of $u_h^*$ satisfying (4) is optimal for (3) if and only if there is a set
of $x_l^*$ satisfying (2) with (5) holding.


3. MINIMIZING THE SUM OF ABSOLUTE DEVIATIONS


Let $x_{ij}$, $i = 1, 2, \dots, k$, and $j = 1, 2, \dots, p$, denote a set of $k$ observational
measurements on $p$ “independent” variables, and $y_i$, $i = 1, 2, \dots, k$, denote
the associated measurement on the “dependent” variable. Note that in the
case of curvilinear regression, we may have $x_{i2} = x_{i1}^2$, or $x_{i2} = \log x_{i1}$, or $x_{i3} = x_{i1}x_{i2}$,
etc. We wish to find regression coefficients $b_j$ that


$$\underset{b_j}{\text{Minimize}}\ \sum_i \Big| \sum_j x_{ij} b_j - y_i \Big|. \tag{6}$$


Using the reduction in Charnes, Cooper, and Ferguson [1], the problem (6)
is transformed into

$$\text{Minimize } \sum_i \epsilon_{1i} + \sum_i \epsilon_{2i} \tag{7}$$

subject to the constraints

$$\sum_j x_{ij} b_j + \epsilon_{1i} - \epsilon_{2i} = y_i, \qquad i = 1, 2, \dots, k \tag{8a}$$

$$b_j \text{ unrestricted in sign} \tag{8b}$$

$$\epsilon_{1i},\ \epsilon_{2i} \text{ non-negative.} \tag{8c}$$

We interpret $\epsilon_{1i}$ and $\epsilon_{2i}$ as vertical deviations “above” and “below” the fitted
line for the $i$-th set of observations; i.e., $\epsilon_{1i} + \epsilon_{2i}$ is the absolute deviation between
the fit $\sum_j x_{ij} b_j$ and $y_i$. By the nature of the model (7) and (8), $\epsilon_{1i}$ and $\epsilon_{2i}$
cannot both be strictly positive in an optimal solution. Thus we have formulated
the regression problem as a linear programming model of type (3) and (4).
The solution to (7) and (8) yields the regression equation

$$b_1 x_1 + \cdots + b_p x_p = y. \tag{9}$$


Note that if we wish the left hand side of (9) to include a coefficient for the
intercept of the $y$ axis to be determined by the linear fit, then we can let
$x_1 = 1$, i.e., let $x_{i1} = 1$ in (8). We may force the fitted line to pass through some
point, the usual example being the set of sample means, either by adding to (8)
the linear restriction

$$b_1 \bar{x}_1 + \cdots + b_p \bar{x}_p = \bar{y} \tag{10}$$


or by the usual least squares approach of subtracting each coordinate of the 
point, in our example the sample mean for each variable, from all the cor- 
responding observations (including y) and then by fitting (8) without a y- 
intercept coefficient; the latter approach simply consists of shifting the origin 
of the axes in a p-dimensional space to the selected point, and then of fitting 
the line (hyperplane) through the new origin. 

According to (8b) we have not restricted the sign of $b_j$; but we may drop this
unessential assumption if, in the context of the regression problem, we desire
to permit only non-negative values for some or all $b_j$ or to force the $b_j$ to satisfy
additional linear constraints of greater complexity [1].

It is noteworthy that collinearity in the $x_{ij}$, even to the degree that for some
$j$ and $j'$, $x_{ij} = x_{ij'}$, will not cause a failure in the linear programming algorithm
for (7) and (8) or for any of the models below.

One drawback of the present model is evident: if the number of observations
$k$ is sizeable, (7) and (8) become computationally unwieldy. We shall now transform
(7) and (8) into a more manageable dual problem which will yield the
optimal $b_j$ as a byproduct. To preserve generality in our treatment, assume
the $b_j$ are partitioned into the classes $M_1$ and $M_2$ according to (4c) and (4d).
Then the dual relationship given in Section 2 implies that we can find a solution
to (7) and (8) if and only if we can find a solution to

$$\text{Maximize } \sum_i y_i d_i \tag{11}$$

subject to the constraints

$$\sum_i x_{ij} d_i \begin{cases} \le 0 & \text{for each } j \text{ in } M_1 \qquad \text{(12a)} \\ = 0 & \text{for each } j \text{ in } M_2 \qquad \text{(12b)} \end{cases}$$

$$d_i \le 1, \qquad i = 1, 2, \dots, k \tag{12c}$$

$$-d_i \le 1, \qquad i = 1, 2, \dots, k \tag{12d}$$

$$d_i \text{ unrestricted in sign.} \tag{12e}$$

Model (11) and (12) is an even larger problem than (7) and (8), since it consists
of $p + 2k$ relations. To reduce the problem to a model in $p$ relations and $k$
bounded variables we let

$$f_i = d_i + 1, \qquad i = 1, 2, \dots, k. \tag{13}$$

Then (11) and (12) are equivalent to

$$\text{Maximize } \sum_i y_i f_i - \sum_i y_i \tag{14}$$

subject to the constraints

$$\sum_i x_{ij} f_i \begin{cases} \le \sum_i x_{ij} & \text{for each } j \text{ in } M_1 \qquad \text{(15a)} \\ = \sum_i x_{ij} & \text{for each } j \text{ in } M_2 \qquad \text{(15b)} \end{cases}$$

$$0 \le f_i \le 2, \qquad i = 1, 2, \dots, k. \tag{15c}$$

The model now consists of $p$ linear relations (15a) and (15b) in bounded non-negative
variables (15c), and may be solved quite rapidly for small $p$ ($< 10$) by
special simplex algorithms for bounded variables problems [3, 4, 12]; such
algorithms have been coded for nearly all medium and large scale electronic
computers. Notice that if $x_{ij}$ and $y_i$ are deviations of sample values from their
means, then the right hand side of (15a) and (15b) is zero and the constant in
(14) is also zero. In the numerical solution of the model, it is customary to
convert (15a) to equalities by appending non-negative “slack” variables [2] $s_j$:

$$\sum_i x_{ij} f_i + s_j = \sum_i x_{ij} \qquad \text{for each } j \text{ in } M_1. \tag{16}$$

The coefficient of $s_j$ in (14) is zero.
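
The duality between (7)–(8) and (11)–(12), and the substitution (13), can be checked numerically in the one-parameter case without a simplex code. The sketch below is not the bounded-variables algorithm referred to above: it finds the least absolute deviations slope by a weighted median of the ratios $y_i/x_i$ (a shortcut swapped in here, valid only for $p = 1$ with $b$ unrestricted, every $x_i$ nonzero, and distinct ratios), builds the $d_i$ from the signs of the residuals, and confirms that the two objective values agree as (5) requires. The data and function names are made up for illustration.

```python
def lad_slope_through_origin(xs, ys):
    """Minimise sum |y_i - b*x_i| over b via a weighted median of y_i/x_i.

    Returns the slope and the index of the observation the line passes through.
    Assumes every x_i is nonzero and the ratios y_i/x_i are distinct.
    """
    order = sorted(range(len(xs)), key=lambda i: ys[i] / xs[i])
    half = sum(abs(x) for x in xs) / 2.0
    running = 0.0
    for i in order:
        running += abs(xs[i])
        if running >= half:
            return ys[i] / xs[i], i
    raise ValueError("no observations")


def dual_solution(xs, ys, b, pivot):
    """Build d_i for (11)-(12): +1 above the line, -1 below, free at the pivot."""
    d = [1.0 if y - b * x > 0 else -1.0 for x, y in zip(xs, ys)]
    d[pivot] = 0.0
    # Choose d at the pivot so that sum_i x_i d_i = 0, as (12b) requires.
    d[pivot] = -sum(x * di for x, di in zip(xs, d)) / xs[pivot]
    return d


# Hypothetical one-variable data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.3, 3.9, 5.6]

b, pivot = lad_slope_through_origin(xs, ys)
d = dual_solution(xs, ys, b, pivot)
f = [di + 1.0 for di in d]                      # substitution (13)

primal = sum(abs(y - b * x) for x, y in zip(xs, ys))          # objective (7)
dual = sum(y * di for y, di in zip(ys, d))                    # objective (11)
dual_shifted = sum(y * fi for y, fi in zip(ys, f)) - sum(ys)  # objective (14)

print(f"b = {b:.4f}, primal = {primal:.4f}, dual = {dual:.4f}")
```

The weighted median is what makes the pivot value of $d_i$ fall inside $[-1, 1]$: its optimality condition says the total weight on either side of the chosen ratio cannot exceed the other side by more than the pivot's own weight.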

In a manner analogous to the techniques for solving regular linear programming
problems [2, 8, 10], bounded variables algorithms produce an optimal
“basic” set of variables, i.e., $p$ of the $f_i$ and $s_j$ (along with some of the $f_i$ at
their upper bound 2 (15c)) will enter the optimal solution, the values of the remaining
variables being zero; we denote these optimal basic variables by $v_t$,
$t = 1, 2, \dots, p$. Let $\gamma_{tj}$ be the coefficients of the $v_t$ in (15b) and (16), and $\lambda_t$ the
corresponding coefficients in (14). Then the regression coefficients $b_j$ satisfy
the relations³

$$\sum_j \gamma_{tj} b_j = \lambda_t, \qquad t = 1, 2, \dots, p. \tag{17}$$

No extra computations are needed to find $b_j$ from (17). In the original simplex
method [2, 10], $b_j$ appears in the $(z_j - c_j)$ row of the final simplex tableau; in
the revised simplex method [10], $b_j$ is the “shadow price” for the optimal solution
of (14) and (15). The optimal value of (14) is the minimized sum of absolute
deviations.

If one desires to place additional constraints on the $b_j$, as in [1], the effect on
(15) will be the addition of new variables; the number of relations remains
unchanged as does the dimension of (17). As suggested by Kelley [8], one may
also wish to add new “independent” variables to improve the regression fit,
which has the effect of increasing the number of relations in (15). The latter
problem may be handled by techniques dealing with “secondary constraints”
[4, 11]. These advanced techniques permit one to add variables to (8), and
consequently relations to (15), in a specified sequence and to determine the
improvement without solving (14) and (15) from the beginning each time.

³ Readers familiar with linear programming will recall that multiple optimal solutions to both the primal and
dual model may exist. It is easy to verify that if the optimal values for $b_j$ in (7) and (8) are unique, then (17) yields
these values regardless of which alternative optimal solution is found in (14) and (15). If there are various values
for $b_j$ which produce the same optimal value for (7), then the solution to (15) will be degenerate, a condition which
is present if some of the optimal basic variables in (15) are at their lower or upper bound, i.e., zero or two.


4. MINIMIZING THE MAXIMUM ABSOLUTE DEVIATION


We now consider a regression problem which in comparison to the model of
the previous section contains one more equation but eliminates the “bounded
variables” restriction. Employing a Chebyshev criterion of best fit, we seek $b_j$
such that

$$\underset{b_j}{\text{Minimize}}\ \Big\{ \underset{i}{\text{Maximum}}\ \Big| \sum_j x_{ij} b_j - y_i \Big| \Big\}. \tag{18}$$


Paralleling Kelley’s treatment [8], we transform (18) into the linear programming
model

$$\text{Minimize } \epsilon \ \text{(non-negative)} \tag{19}$$

subject to the constraints

$$-\epsilon \le \sum_j x_{ij} b_j - y_i \le \epsilon, \qquad i = 1, 2, \dots, k. \tag{20}$$

In (19) and (20) $\epsilon$ is the minimized value of the maximum absolute deviation.
As before, to preserve generality we assume the $b_j$ are partitioned according to
(4c) and (4d). In preparation for the application of the dual theorem, we expand
(19) and (20) into

$$\text{Minimize } \epsilon \tag{21}$$

subject to the constraints

$$-\sum_j x_{ij} b_j + \epsilon \ge -y_i, \qquad i = 1, 2, \dots, k \tag{22a}$$

$$\sum_j x_{ij} b_j + \epsilon \ge y_i, \qquad i = 1, 2, \dots, k \tag{22b}$$

$$b_j \begin{cases} \text{non-negative} & \text{for each } j \text{ in } M_1 \qquad \text{(22c)} \\ \text{unrestricted in sign} & \text{for each } j \text{ in } M_2 \qquad \text{(22d)} \end{cases}$$

$$\epsilon \text{ non-negative.} \tag{22e}$$

The dual formulation is then

$$\text{Maximize } -\sum_i y_i d_{1i} + \sum_i y_i d_{2i} \tag{23}$$

subject to the constraints

$$-\sum_i x_{ij} d_{1i} + \sum_i x_{ij} d_{2i} \begin{cases} \le 0 & \text{for each } j \text{ in } M_1 \qquad \text{(24a)} \\ = 0 & \text{for each } j \text{ in } M_2 \qquad \text{(24b)} \end{cases}$$

$$\sum_i d_{1i} + \sum_i d_{2i} \le 1 \tag{24c}$$

$$d_{1i},\ d_{2i} \text{ non-negative.} \tag{24d}$$

Model (23) and (24) is a regular linear programming problem in $p + 1$ relations
and may be solved by any standard algorithm [2, 10]. If $d_{1i}$ ($d_{2i}$) is positive
in the optimal solution of (23) and (24), then the maximum deviation
occurs for the $i$-th sample point, i.e., for the $i$-th relation in (22a), (22b), and
this point will lie above (below) the fitted line.

As with the model in Section 3, we may derive from the optimal solution of 
(23) and (24) the values of the regression coefficients. Assume “slack variables” 
have been added to (24a) and (24c) to convert the relations to equalities, 
analogous to (16). The optimal basic solution consists of p+1 variables, which
we denote by $v_i$, $i = 1, 2, \cdots, p+1$. Let $\gamma_{ij}$ be the coefficients of the $v_i$ in
(24a) and (24b), the former having been converted to equalities, and $\lambda_i$ the
corresponding coefficients in (23). Then the regression coefficients $b_j$ and the
error $\epsilon$ satisfy the relations⁴


$$\sum_j \gamma_{ij} b_j + \epsilon = \lambda_i, \qquad i = 1, 2, \cdots, p+1. \qquad (25)$$


Once again, no extra computations are needed to find $b_j$ and $\epsilon$ from (25), since
these values are automatically computed in the application of the simplex 
algorithms. 

Finally the comments at the end of Section 3 concerning the addition of con- 
straining relations and new “independent” variables are equally applicable 
here.⁵


5. A NUMERICAL EXAMPLE 


Karst [7] examines the data in Table 211 which comprise deviations of the 
original data about their sample means. 


TABLE 211 























He finds the least squares fit to be
$$y = .539x, \qquad (26)$$
and the fit for the minimized sum of absolute deviations to be
$$y = .659x. \qquad (27)$$


As we have suggested, (27) may be obtained by (14) and (15), where specifi- 
cally we would find in (17) 

8.5b = 5.6 (28a) 

b = .659. (28b) 





4 We are assuming that we are dealing with the non-trivial case in which $\epsilon > 0$. Readers interested in the underlying
mathematics of the model may verify that, given this assumption, the slack variable introduced in (24c) does
not enter an optimal solution at a positive level and that an optimal basic set of variables $v_i$ exists yielding (25)
[5, 9].

5 In lieu of a “secondary constraint” algorithm [4, 11] for handling the impact in (24) of additional variables
in (22), Kelley [8] suggests the alternative of introducing into the enlarged basic solution an “artificial” variable
with an arbitrarily high cost, and subsequently driving its value to zero.
The solution by model (21) and (22) yields for (25)

$$-6.5b + \epsilon = 3.6$$
$$11.5b + \epsilon = 9.6 \qquad (29a)$$
$$b = .333, \qquad \epsilon = 5.767. \qquad (29b)$$

That is, the Chebyshev fit is


$$y = .333x; \qquad (30)$$


since the equations (29a) correspond to the optimal variables $d_{2,3}$ and $d_{2,12}$, the
third and last sample points in Table 211 will lie above the fitted line and assume
the maximum vertical deviation from it of 5.767.
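
The arithmetic leading from (29a) to (29b) can be checked with a few lines of Python. This is only a sketch: the helper name `solve_two_equations` is an illustrative choice, and the coefficients are the two relations quoted in (29a).

```python
def solve_two_equations(a1, c1, a2, c2):
    """Solve a1*b + eps = c1 and a2*b + eps = c2 for (b, eps)."""
    if a1 == a2:
        raise ValueError("the two equations do not determine b uniquely")
    # Subtracting the first relation from the second eliminates eps.
    b = (c2 - c1) / (a2 - a1)
    eps = c1 - a1 * b
    return b, eps


# Coefficients of the two relations in (29a).
b, eps = solve_two_equations(-6.5, 3.6, 11.5, 9.6)
print(round(b, 3), round(eps, 3))  # 0.333 5.767
```

Both relations are satisfied with the same value of the error term, which is what the Chebyshev criterion requires of the two sample points that attain the maximum deviation.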
COMMENTS ON “THE SIMPLEST SIGNED-RANK TESTS”


John E. Walsh*
Lockheed Aircraft Corporation 


In a fundamental contribution, Tukey [5] showed that the Wilcoxon 
signed-rank test [14, 15, 16, 17] is equivalent to a subclass of some tests 
presented by Walsh [7, 8]. Since Tukey’s memorandum did not explic- 
itly state that only a subclass of Walsh’s tests is involved, misunder- 
standings seem to have arisen concerning the extent of the equivalence 
between these two classes of tests. This paper points out that a large 
proportion of Walsh’s results are not equivalent to Wilcoxon signed- 
rank tests, and that some of these nonequivalent results have useful 
properties, which can be exploited to obtain tests and confidence inter- 
vals having increased validity, increased efficiency, additional prob- 
ability levels, or requiring less computation, compared to the Wilcoxon 
signed-rank results. 


1. INTRODUCTION 


IN THE memorandum report The Simplest Signed-Rank Tests [5], Tukey
showed that the Wilcoxon signed-rank test is equivalent to a subclass of
some tests developed by Walsh [7, 8] for the case of independent observations
from continuous symmetrical populations with a common central value. From
a practical viewpoint, this represents almost strict equivalence. Wilcoxon’s
signed-rank test is based on the assumption of independent observations, symmetry
(about zero), and no ties. In practice, this is virtually always equivalent
to the assumption of independent continuous observations from populations 
that are symmetrical about the same point. Since Walsh’s tests are based on 
confidence intervals for the median of the populations sampled (central median 
value if median not unique), Tukey’s results identify the confidence intervals 
that can be considered to correspond to the Wilcoxon signed-rank tests. 

Although Tukey’s report contains no definite statement to the effect that 
Wilcoxon’s signed-rank test is equivalent to the entire class of tests presented 
by Walsh, this impression seems to exist. For example, in his review [3] of 
Siegel’s book Nonparametric Statistics for the Behavioral Sciences [4], I. R.
Savage states, on the basis of Tukey’s memorandum, that the Wilcoxon signed- 
rank test is essentially the same as the “Walsh test.” Actually less than 4 of 
the “Walsh tests” listed in Siegel’s book are equivalent to Wilcoxon signed- 
rank tests. Even when the number of observations does not exceed 15, only a 
very small proportion of the results developed by Walsh are equivalent to 
Wilcoxon signed-rank tests. This proportion decreases as the number of observations
increases and has a limiting value of zero.

Let us identify and examine the subclass of Walsh’s results that is equivalent 
to the Wilcoxon signed-rank tests. The data consist of n independent observa- 
tions from n possibly different continuous populations all of which are sym- 
metrical about the same unknown value $\phi$. The problem is to test the null
hypothesis that $\phi = \phi_0$, where $\phi_0$ is a specified value. Throughout this paper,


* The author is now with The System Development Corporation, Santa Monica, California. 
$x_1 < x_2 < \cdots < x_n$ will be used to represent the ordered values of the observations.
In all, $n(n+1)/2$ distinct averages of the type $(x_i + x_j)/2$ can be formed.
Let the ordered values of these averages be denoted by $y_1 < y_2 < \cdots < y_{n(n+1)/2}$.
Then Tukey [5] showed that the Wilcoxon signed-rank tests are equivalent 
to the subclass of Walsh’s tests [7, 8] that can be expressed in one of the forms 


Accept $\phi > \phi_0$ if $y_r > \phi_0$.
Accept $\phi < \phi_0$ if $y_s < \phi_0$.
Accept $\phi \ne \phi_0$ if either $y_r > \phi_0$ or $y_s < \phi_0$. $(r < s)$


That is, the Walsh results that are equivalent to Wilcoxon signed-rank tests 
are those based on single order statistics of the entire set of n(n+1)/2 averages 
of pairs of ordered observations. 

Walsh’s results are not restricted to the entire set of averages. Tests can 
also be derived for any specified subset of these averages. Let $z_1 < \cdots < z_h$
be the values of any given subset of h of the $(x_i + x_j)/2$, where $i \ge j$ and
$h \le n(n+1)/2$. Then exact tests with critical regions of the forms


$z_r > \phi_0$, $z_s < \phi_0$, $z_r > \phi_0$ or $z_s < \phi_0$ $(r < s)$


can be obtained. The totality of tests of this type, and the corresponding con- 
fidence intervals, constitute Walsh’s results. 

Only a few of the y’s, individually equivalent to the Wilcoxon signed-rank
tests, can be expressed as uncomplicated functions of the x’s. Namely,


$y_1 = x_1$, $y_2 = (x_2 + x_1)/2$, $y_3 = \min[x_2, (x_3 + x_1)/2]$,


$y_{n(n+1)/2} = x_n$, $y_{n(n+1)/2-1} = (x_n + x_{n-1})/2$,


$y_{n(n+1)/2-2} = \max[x_{n-1}, (x_n + x_{n-2})/2]$.


The remaining y’s consist of more complicated functions of the x’s. For example,


$y_4$ = next to smallest of $x_2$, $(x_3 + x_1)/2$, $(x_4 + x_1)/2$.


Less complicated functions based on averages of pairs of order statistics, such as 
(vx +a1)/2, (tn +2n-4)/2, min [z2, (x4 + 21)/2] 


can provide tests and confidence intervals not equivalent to Wilcoxon tests. 
Uncomplicated functions of the x’s often have computational advantages and,
if carefully chosen, can have other desirable properties.
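
The relation between the ordered averages and the order statistics can be checked numerically. The sketch below forms all $n(n+1)/2$ averages of pairs for the ten observations used in Section 2 and compares the extreme y’s with the closed forms stated above. The function name `walsh_averages` is an illustrative choice, and $y_4$ is taken as the next to smallest of $x_2$, $(x_3+x_1)/2$, $(x_4+x_1)/2$.

```python
from itertools import combinations_with_replacement
from math import isclose


def walsh_averages(observations):
    """Return the sorted averages (x_i + x_j)/2 over all pairs with i <= j."""
    xs = sorted(observations)
    return sorted((a + b) / 2 for a, b in combinations_with_replacement(xs, 2))


obs = [6.50, 7.83, 2.67, 4.02, 4.13, 5.81, 9.22, 5.33, 3.99, 1.11]
x = sorted(obs)          # x[0] is x_1, x[1] is x_2, and so on
y = walsh_averages(obs)  # y[0] is y_1, y[1] is y_2, and so on
n = len(x)

# Closed forms for the smallest four and largest three ordered averages.
y1 = x[0]
y2 = (x[1] + x[0]) / 2
y3 = min(x[1], (x[2] + x[0]) / 2)
y4 = sorted([x[1], (x[2] + x[0]) / 2, (x[3] + x[0]) / 2])[1]
y_top = x[n - 1]
y_top_minus_1 = (x[n - 1] + x[n - 2]) / 2
y_top_minus_2 = max(x[n - 2], (x[n - 1] + x[n - 3]) / 2)

print(len(y), y[:4], y[-3:])
```

Only a handful of order statistics enter these closed forms, whereas a y from the middle of the list requires forming and sorting many of the 55 averages.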

The tests and confidence intervals that are equivalent to Wilcoxon signed- 
rank tests (referred to as “equivalent results”) may well be the most important
subclass of the results developed by Walsh [7, 8]. However, in some cases 
appropriate nonequivalent tests and confidence intervals can be used to advan- 
tage. The next section, Computational Advantages, points out that use of 
suitable, nonequivalent results can require less calculation, especially in deter- 
mining confidence intervals. The following section, Additional Probability 
Levels, calls attention to the fact that the number of exact significance levels 
and confidence coefficients available for the equivalent results can be augmented
by use of nonequivalent results. The fourth section, Bounded Probabil-
ity Level Results, points out that bounded significance level tests and con- 
fidence intervals with bounded confidence coefficients are obtainable on the 
basis of some of the nonequivalent results. These tests and intervals have 
probability levels that are approximately valid (with specified bounds) for 
very general classes of continuous populations; symmetry is not required. The
final section, Advantages for Special Situations, shows that in some cases large
gains in efficiency can be obtained by use of suitable nonequivalent results,
rather than equivalent results, if additional information of a general nature about
the populations is available. Also nonequivalent results can be adapted to test 
other population properties such as symmetry in the tails, and to furnish a 
basis for nonparametric rejection of outlying observations for symmetrical pop- 
ulations. 

A possible disadvantage of Walsh’s class of results is that it is of such a great 
extent that several candidates are available for the same job. Thus the statisti- 
cian may unconsciously pick that procedure that leads to the conclusion he 
favors. This type of difficulty, which is often encountered in the statistics field, 
can be avoided by choosing the procedure independently of the data. Namely, 
the statistician chooses the statistical procedure to be applied before he has 
knowledge of the values of the observations. 


2. COMPUTATIONAL ADVANTAGES 


A desirable property of the Wilcoxon signed-rank method is that excessive 
computation is not required. In some cases, however, the amount of computa- 
tion can be noticeably reduced by using a suitable nonequivalent procedure 
rather than the Wilcoxon method. This computational advantage is especially
strong for the case of confidence intervals and for the case of tests with interpolated
significance levels [5].

Consider first the case of significance tests with exact significance levels. To
apply the Wilcoxon signed-rank method, the n absolute values of $x_1 - \phi_0, \cdots, x_n - \phi_0$
must be ranked. That is, first $\phi_0$ must be subtracted from each observed
value and then all the absolute values of the resulting numbers ranked. Also
the sign of each such difference, $x_i - \phi_0$, must be noted. Often determination
of a few of the order statistics of the observations is easier than such an evaluation
and the ranking of all the absolute values. Once the pertinent order statistics
are determined, the additional computation required is not very great.
If the graphical method of Walsh [9] is used, no further computation is required 
for application of an order statistic test of the type listed in Siegel’s book. In 
many instances, use of one of a set of tests based on uncomplicated functions 
of no more than four order statistics in each tail will require less computation 
than use of the corresponding Wilcoxon signed-rank test. 

The following numerical example is given to illustrate the computational 
effort involved for a one-sided nonequivalent test based on four order statistics 
and the corresponding Wilcoxon signed-rank test. A significance level of ap- 
proximately 0.0435 is desired and n=10. The null hypothesis asserts that
$\phi = 5.66$ and the alternative hypothesis of interest is $\phi < 5.66$. The observations,
which are believed to be statistically independent and from continuous symmetrical
populations with common central value $\phi$, have the values


6.50, 7.83, 2.67, 4.02, 4.13, 5.81, 9.22, 5.33, 3.99, 1.11 


A nonequivalent test with significance level 44/1024 (see Table 1) is 


Accept $\phi < 5.66$ if $\max[(x_8 + x_7)/2, (x_{10} + x_3)/2] < 5.66$.

Inspection shows that $x_3 = 3.99$, $x_7 = 5.81$, $x_8 = 6.50$, $x_{10} = 9.22$. Thus

$(x_8 + x_7)/2 = 6.155$, $(x_{10} + x_3)/2 = 6.605$.


Since 6.605 is not less than 5.66, the null hypothesis is not rejected in favor of
the alternative of interest. Now consider the corresponding Wilcoxon signed- 
rank test, for which the exact significance level is 43/1024. First 5.66 is sub- 
tracted from each observation, yielding the values 


0.84, 2.17, −2.99, −1.64, −1.53, 0.15, 3.56, −0.33, −1.67, −4.55


Next the absolute values of these numbers are ranked from 1 to 10, giving the 
ranks (with the sign of the observations stated in parentheses) 


3(+), 7(+), 8(−), 5(−), 4(−), 1(+), 9(+), 2(−), 6(−), 10(−)


Thus the sum of the ranks which correspond to observations with plus signs is 
20. Since this sum is not less than the critical level for n=10 and a 43/1024 significance
level [16, 17], the null hypothesis of $\phi = 5.66$ is not rejected in favor
of the alternative $\phi < 5.66$.
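
Both computations in this example can be reproduced directly from the ten observations. The following sketch is illustrative only: the variable names are arbitrary, the order statistics are read off the sorted data, and the critical value of the signed-rank test is not computed (it comes from the tables cited in [16, 17]).

```python
obs = [6.50, 7.83, 2.67, 4.02, 4.13, 5.81, 9.22, 5.33, 3.99, 1.11]
phi0 = 5.66

# Order-statistic test: only x_3, x_7, x_8 and x_10 are needed.
x = sorted(obs)  # x[0] is x_1, ..., x[9] is x_10
statistic = max((x[7] + x[6]) / 2, (x[9] + x[2]) / 2)
accept_alternative = statistic < phi0

# Wilcoxon signed-rank test: rank all |x_i - phi0| and sum the ranks
# that belong to positive differences.
diffs = [v - phi0 for v in obs]
order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
ranks = [0] * len(diffs)
for rank, index in enumerate(order, start=1):
    ranks[index] = rank
positive_rank_sum = sum(r for r, d in zip(ranks, diffs) if d > 0)

print(round(statistic, 3), accept_alternative, positive_rank_sum)
```

The first test touches four order statistics and two averages; the second requires ten subtractions and a full ranking, which is the contrast the example is meant to bring out.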

The computational advantages of the nonequivalent results are more appar- 
ent for the case of determining confidence intervals for ¢. Here, to apply Wil- 
coxon techniques, at least a moderate number of the x’s must be evaluated
and then the pertinent y (or y’s) determined. This very frequently requires
more computation than determination of a few of the x’s and evaluation of an
uncomplicated function of these x’s.

Finally, let us consider results with interpolated probability levels. These 
interpolated probability levels have the property of being bounded between 
two exactly specified values for the case of independent observations from con-
tinuous symmetrical populations with a common central median. Tests and 
confidence intervals of this type were introduced by Tukey [5]. These results 
are useful for cases where the available exact probability levels are not suffi- 
ciently close to the desired values. All of Tukey’s interpolated results are based 
on Wilcoxon signed-rank tests and their equivalent form in terms of order 
statistics (i.e., using the y’s).

For the interpolated probability level case, computation can nearly always 
be reduced without much loss of efficiency on the basis of nonequivalent results. 
As an example [10], 


Accept $\phi > \phi_0$ if $\min\left[x_2,\ \tfrac{1-a}{2} x_{k+1} + \tfrac{a}{2} x_k + \tfrac{1}{2} x_1\right] > \phi_0$, $(0 \le a \le 1)$,

has an interpolated significance level of $(k+1-a)/2^n$. If the n independent
observations are from continuous symmetrical populations with common central
median $\phi$, the true significance level of this test is bounded between $k/2^n$
and $(k+1)/2^n$. Similarly,

Accept $\phi < \phi_0$ if $\max\left[x_{n-1},\ \tfrac{1}{2} x_n + \tfrac{a}{2} x_{n+1-k} + \tfrac{1-a}{2} x_{n-k}\right] < \phi_0$


has an interpolated significance level of $(k+1-a)/2^n$ and bounds of $k/2^n$,
$(k+1)/2^n$. Also the two-sided test


Accept $\phi \ne \phi_0$ if either $\min\left[x_2,\ \tfrac{1-a}{2} x_{k+1} + \tfrac{a}{2} x_k + \tfrac{1}{2} x_1\right] > \phi_0$

or $\max\left[x_{n-1},\ \tfrac{1}{2} x_n + \tfrac{a}{2} x_{n+1-k} + \tfrac{1-a}{2} x_{n-k}\right] < \phi_0$


has an interpolated level of $(k+1-a)/2^{n-1}$ and bounds of $k/2^{n-1}$, $(k+1)/2^{n-1}$.
If k is not too large (say k<3n/4), computations [7, 8] indicate that these
three interpolated tests, and the corresponding confidence intervals, have reasonably
high efficiencies, at least for the case of a sample from a normal population.


3. ADDITIONAL PROBABILITY LEVELS 


For the results [7, 8] and the case of n observations, all the exact probability
levels are of the form $t/2^n$, where $t$ is an integer. Relatively small values of $t$
are of major interest for significance levels and relatively large values for confidence
coefficients. Unless n is rather large, all values of $t$ less than $2^n/10$ and
greater than $9(2^n)/10$ usually can be of interest for one-sided results. However,
the exact probability levels for the Wilcoxon signed-rank results do not include
a large proportion of these values of $t$ except for very small n. For example,
when n=10 the one-sided signed-rank tests only have significance levels with
t=1, 2, 3, 5, 7, 10, 14, 19, 25, 33, 43, 54, 67, 82, 99 [5]. That is, 87 of the 102
values for $t$ in the range $1 \le t \le 2^n/10$, a total of over 85 per cent, are not assumed
by the Wilcoxon one-sided test for n=10. Fortunately, these “gaps” in
exact probability levels can be filled in to a great extent by the use of suitable
nonequivalent tests, not only yielding a larger selection of exact probability
levels but also furnishing a basis for obtaining tests with still more accurate
interpolated significance levels. A further step might be to replace some of the 
equivalent tests by preferable nonequivalent tests having these same signifi- 
cance levels. 
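
The list of attainable values of t for n=10 can be regenerated by counting, for each possible rank sum, the number of subsets of the ranks 1, ..., n having that sum. The sketch below uses a standard subset-sum recursion for this count; the function names are illustrative and are not taken from the cited references.

```python
def rank_sum_counts(n):
    """ways[s] = number of subsets of {1, ..., n} whose elements sum to s."""
    max_sum = n * (n + 1) // 2
    ways = [0] * (max_sum + 1)
    ways[0] = 1
    for r in range(1, n + 1):
        # Go downward so that each rank is used at most once.
        for s in range(max_sum, r - 1, -1):
            ways[s] += ways[s - r]
    return ways


def attainable_levels(n, limit):
    """Cumulative counts t (levels t / 2**n) that do not exceed limit."""
    levels, total = [], 0
    for w in rank_sum_counts(n):
        total += w
        if total > limit:
            break
        levels.append(total)
    return levels


n = 10
limit = 2**n // 10  # largest integer t not exceeding 2^n / 10
levels = attainable_levels(n, limit)
missing = limit - len(levels)
print(levels)
print(missing, round(100 * missing / limit, 1))
```

Each cumulative count, divided by $2^n$, is the exact one-sided significance level of a signed-rank test, so the integers skipped between successive counts are exactly the gaps discussed in the text.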

Some general probability relations that are useful for filling in probability 
level “gaps” for both tests and confidence intervals are given in Table 218. These 
relations are valid for the case of n independent observations from continuous 
symmetrical populations with common central median ¢. The power function 
computations [7, 8] indicate that the tests and confidence intervals obtained 
from Table 218 have reasonably high efficiencies if the value of k is not too large,
at least for the case of normality. As examples of the application of these gen- 
eral relations, for n=10 
has significance level $8/1024 = 8/2^n$,

Accept $\phi < \phi_0$ if $\max[x_6, (x_{10} + x_5)/2] < \phi_0$

has significance level 31/1024, and

Accept $\phi < \phi_0$ if $\max[(x_8 + x_7)/2, (x_{10} + x_4)/2] < \phi_0$


has significance level 32/1024. Examination shows that use of Table 218 for
n=10 and k<8 furnishes the additional values of t=4, 6, 7, 8, 11, 12, 15, 16,
20, 22, 24, 26, 29, 31, 32, 42, 44, 49, 57, 63, 64.
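
Significance levels of this kind can be verified by brute force. The sketch below rests on a standard argument that is assumed here rather than quoted from the paper: for independent observations from continuous populations symmetrical about $\phi_0$, the $2^n$ assignments of signs to the ordered absolute deviations are equally likely, and a test built only from order statistics and averages of pairs depends on the data only through that assignment. The magnitudes 1, ..., n and the function name are illustrative choices, with $\phi_0$ taken as zero.

```python
from itertools import product


def count_sign_patterns(n, event):
    """Count the sign patterns on magnitudes 1..n for which event(x) holds,
    where x is the sorted list of signed values (x[0] is x_1)."""
    count = 0
    magnitudes = range(1, n + 1)
    for signs in product((-1, 1), repeat=n):
        x = sorted(s * m for s, m in zip(signs, magnitudes))
        if event(x):
            count += 1
    return count


n = 10
# max[x_6, (x_10 + x_5)/2] < phi_0
t_a = count_sign_patterns(n, lambda x: max(x[5], (x[9] + x[4]) / 2) < 0)
# max[(x_8 + x_7)/2, (x_10 + x_4)/2] < phi_0
t_b = count_sign_patterns(
    n, lambda x: max((x[7] + x[6]) / 2, (x[9] + x[3]) / 2) < 0)
# max[(x_8 + x_7)/2, (x_10 + x_3)/2] < phi_0, the test used in Section 2
t_c = count_sign_patterns(
    n, lambda x: max((x[7] + x[6]) / 2, (x[9] + x[2]) / 2) < 0)

print(t_a, t_b, t_c)  # numerators of the significance levels t / 1024
```

The same routine applies to any candidate test of this form, which makes it a convenient check when selecting a procedure with a required exact level.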


4. BOUNDED PROBABILITY LEVEL RESULTS


A more general problem than that handled by the Wilcoxon signed-rank 
test and Walsh’s results [7, 8] is to obtain tests and confidence intervals for 
the common median of arbitrary continuous populations on the basis of inde- 
pendent observations from these populations. (The populations are no longer 
required to be symmetrical.) The sign test procedure [e.g., 1] furnishes a gen- 
eral solution to this problem in the sense that exact probability levels are ob- 
tained. However, investigation indicates that nearly all procedures of the sign 
test type have moderate or low efficiencies [1, 2, 6], at least for the normality
case. The only practically important exceptions occur when the test or con- 
fidence interval is based on the largest and/or smallest of the observation 
values. Consequently, the situations (numbers of observations, significance 
levels, and confidence coefficients) available for efficient sign test type pro- 
cedures are very limited. The resulting “gaps” in numbers of observations and 
probability levels can be partially filled in with approximate tests and con- 
fidence intervals that appear to be reasonably efficient [13]. These procedures 
have approximate probability levels, which are not exactly determined but,
under very general conditions, are bounded, from above and below, by definite 
values. 

The nonequivalent results [7, 8] furnish a useful basis for developing tests 
and confidence intervals with probability level bounds which are (at least mod- 
erately) close together. Two different situations are important. In one, the 
observations, which must be independent, may come from different continuous 
populations (not necessarily symmetrical) with a common median. In the 
other, the observations are required to be a sample from a single continuous 
population which need not be symmetrical. In the following paragraph, ab- 
solute probability level bounds are determined for the first situation. A param- 
eter is introduced for measuring the deviation from symmetry for the popula- 
tion sampled in the second situation [13], and bounds for the probability levels 
are stated as functions of this parameter. (The type of parameter used for one- 
sided procedures is different from that used for two-sided procedures.) 

First let us consider the case of n independent observations from possibly 
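different continuous populations with a common median $\phi$. Then for $1 < k < n-1$,
$0 \le a \le 1$, and $0 \le b \le 1$,

$$\Pr(x_1 > \phi) \le \Pr\left\{\min\left[x_2,\ \tfrac{1-a}{2} x_{k+1} + \tfrac{a}{2} x_k + \tfrac{1}{2} x_1\right] > \phi\right\} \le \Pr(x_2 > \phi)$$

$$\Pr(x_n < \phi) \le \Pr\left\{\max\left[x_{n-1},\ \tfrac{1}{2} x_n + \tfrac{a}{2} x_{n+1-k} + \tfrac{1-a}{2} x_{n-k}\right] < \phi\right\} \le \Pr(x_{n-1} < \phi)$$

$$\Pr(x_1 > \phi) \le \Pr[(1-b) x_2 + b x_1 > \phi] \le \Pr(x_2 > \phi)$$

$$\Pr(x_n < \phi) \le \Pr[b x_n + (1-b) x_{n-1} < \phi] \le \Pr(x_{n-1} < \phi).$$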


The upper and lower limits are sign test probabilities, and therefore exactly 
determined for this situation. In fact, under the assumed conditions 


$$\Pr(x_1 > \phi) = \Pr(x_n < \phi) = 1/2^n, \qquad \Pr(x_2 > \phi) = \Pr(x_{n-1} < \phi) = (n+1)/2^n.$$


The power function calculations [7, 8] indicate that tests and confidence inter- 
vals based on the interpolated expressions have reasonably high efficiencies, 
at least for normality, if k and 1—b are not too large (say, k<3n/4 and b>1/4). 
By suitable selection of k and a, or b, procedures can be obtained in most prac- 
tical situations which have interpolated probability levels reasonably near 
chosen values, and whose bounds are at least moderately close together. 

The special case where the populations are symmetrical and $\phi$ is the central
median for all the populations serves as a basis for the interpolation procedure.
Probabilities are exactly determined for all values of k if a=0 or 1; also for
b=0, 1/2, 1. The interpolated probability value for either expression involving
k and a is $(k+1-a)/2^n$. By suitable selection of k and a, tests and confidence
intervals with specified interpolated probability levels can be easily obtained.
Thus, for n=7 (and k=2, a=.44)


$\min[x_2,\ .28x_3 + .22x_2 + .50x_1] < \phi$


is a one-sided confidence interval for $\phi$ with an interpolated confidence coefficient
of .98, which should be very close to the true value for symmetrical popu-
lations. It should be moderately close to the true value for a large class of 
populations. In no case, however, will the actual confidence coefficient exceed 
.9922 or fall below .9375. 
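
The figures in this example follow directly from the expressions above, as the short sketch below illustrates. The function name and the layout of its return value are illustrative choices; the formulas are the interpolated level $(k+1-a)/2^n$ and the sign test limits $1/2^n$ and $(n+1)/2^n$.

```python
def interpolated_one_sided(n, k, a):
    """Weights, interpolated level and bounds for
    min[x_2, ((1-a)/2) x_{k+1} + (a/2) x_k + (1/2) x_1] > phi."""
    if not 0 <= a <= 1:
        raise ValueError("a must lie between 0 and 1")
    weights = ((1 - a) / 2, a / 2, 0.5)  # on x_{k+1}, x_k, x_1
    level = (k + 1 - a) / 2**n
    symmetric_bounds = (k / 2**n, (k + 1) / 2**n)  # symmetrical populations
    absolute_bounds = (1 / 2**n, (n + 1) / 2**n)   # common median only
    return weights, level, symmetric_bounds, absolute_bounds


weights, level, symmetric_bounds, absolute_bounds = interpolated_one_sided(7, 2, 0.44)
confidence = 1 - level
confidence_limits = (1 - absolute_bounds[1], 1 - absolute_bounds[0])

print([round(w, 2) for w in weights])              # [0.28, 0.22, 0.5]
print(round(confidence, 4))                        # 0.98
print([round(c, 4) for c in confidence_limits])    # [0.9375, 0.9922]
```

Solving $(k+1-a)/2^n$ for a at a chosen level is the reverse calculation: with n=7, k=2 and a target of .02, a = 3 − 128(.02) = .44.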

Interpolated probability levels for both probability expressions involving b
can be obtained by graphical interpolation based on the values of $(n+1)/2^n$
for b=0, $2/2^n$ for b=1/2, and $1/2^n$ for b=1. Thus, for n=6, both


$\Pr(.67x_2 + .33x_1 > \phi)$ and $\Pr(.33x_6 + .67x_5 < \phi)$


have an interpolated value of .05. If the populations are symmetrical, these
probabilities should be near .05, and indeed, should not be far from .05 for a
wide range of populations. (They never fall outside of the interval .0156 to
.1094.)

Finally, let us consider the case where the observations are a sample from a
continuous population, not necessarily symmetrical, with median $\phi$. The additional
knowledge that the observations are from the same population allows
more refined types of bounds to be obtained for probability levels. Specifically,
limits for the values of


$$\Pr\left\{\min\left[x_2,\ \tfrac{1-a}{2} x_{k+1} + \tfrac{a}{2} x_k + \tfrac{1}{2} x_1\right] > \phi\right\}$$

$$\Pr\left\{\max\left[x_{n-1},\ \tfrac{1}{2} x_n + \tfrac{a}{2} x_{n+1-k} + \tfrac{1-a}{2} x_{n-k}\right] < \phi\right\}$$

can be stated as a function of a parameter which measures the deviation of the 
population sampled from symmetry. The situations where a=0, k<5 and 
a=1, k=6 have been considered [13] for n<12. Bounds corresponding to 
intermediate values of a can be approximately determined by linear interpola- 
tion. If the deviation from symmetry is only moderate, the two bounds are 
quite close together. Even if the distribution sampled is as unsymmetrical as 
the single (or truncated) exponential distribution (i.e., probability density
function $= e^{-x}$ for $0 \le x < \infty$, and $= 0$ otherwise), the upper and lower limits are
still reasonably close together. For 7 observations, so long as the deviation from
symmetry is no worse than that for the exponential population,


$.02 \le \Pr\{\min[x_2, (x_4 + x_1)/2] > \phi\} \le .05.$


The absolute limits, for all possible deviations from symmetry, are .0078 and 
.0625. 

The two-sided tests and confidence intervals considered are those correspond- 
ing to probabilities of the form 


$$\Pr\left(\tfrac{1-b}{2} x_{k+1} + \tfrac{b}{2} x_k + \tfrac{1}{2} x_1 < \phi < \tfrac{1}{2} x_n + \tfrac{b}{2} x_{n+1-k} + \tfrac{1-b}{2} x_{n-k}\right),$$


where k<6, $0 \le b \le 1$, and n<12. Computations [7, 8] indicate that the corresponding
tests and confidence intervals have reasonably high efficiencies, at
least for normality. Upper and lower limits for the value of this probability
have been evaluated [13] for b=0, 1. The bounds corresponding to intermediate
values of b can be approximated by linear interpolation. This probability
has an upper limit of $1 - (1/2^{n-k-1})$ for b=0 and of $1 - (1/2^{n-k})$ for b=1. (In
both of these situations the value for symmetry is an upper bound for all pos- 
sible values!) For k=2, the absolute bounds for the two-sided case are much 
closer than for the one-sided cases with the same probability level. However, 
for k=3, these absolute bounds are quite far apart, with the closeness decreas- 
ing rapidly as k increases. Fortunately a large deviation from symmetry is re- 
quired to move the conditional lower bound a substantial distance from the 
absolute upper bound. As an example, let n=8. Then 


$.895 \le \Pr[(x_4 + x_1)/2 < \phi < (x_8 + x_5)/2] \le .9375$


if the deviation from symmetry is no greater than that for a single exponential 
population. The absolute bounds for this case are .637 and .9375. 
5. ADVANTAGES FOR SPECIAL SITUATIONS


When populations have finite ranges, the highest efficiencies tend to come 
from tests and confidence intervals based exclusively on extreme observations. 
This is nearly always the case when the observations are a sample (i.e., are 
from the same population). As a consequence, for n at least moderately large, 
extreme observations usually have much smaller variances than other (ordered) 
observations. Thus, when sampling from a rectangular distribution, for example,
the variances of the extreme observations are of order $1/n^2$ while the variances
of the central observations are only of order $1/n$. Examination shows
that the Wilcoxon signed-rank results are based predominantly on central ob- 
servations. Consequently, if nonequivalent results that depend entirely on ex- 
treme observations are available, they would seem to be preferable to Wilcoxon 
signed-rank results when the symmetrical populations are known to have finite 
ranges. The first part of the present section calls attention to some nonequivalent
results for investigating $\phi$ which depend only on extreme order statistics.

A class of tests and confidence intervals that are based exclusively on extreme 
observations has been developed [11]. The probability levels for these non- 
equivalent results are exact if the n independent observations are from any 
continuous symmetrical populations with a common central median. In par- 
ticular, for $n \ge k$


$$\Pr[(x_{n+1-k} + x_1)/2 > \phi] = \Pr[(x_n + x_k)/2 < \phi] = 1/2^k.$$


Additional results of an uncomplicated nature have also been given [11, Table 1].
When the observations are a sample, the extreme observation tests and
confidence intervals [11] should have reasonably high efficiencies and should be 
more efficient than the Wilcoxon signed-rank results. 

The extreme order statistic results [11] form the basis of some tests of 
hypotheses that do not involve the value of $\phi$. Two types of situations are
there considered: 

(a) The observations are known to be independent and from continuous 
symmetrical populations. The null hypothesis asserts that the popula- 
tions have a common central median. One alternative hypothesis of 
interest asserts that a specified number of the largest observations are 
too large to be consistent with the null hypothesis. Another alternative 
hypothesis of interest states that a specified number of the smallest ob- 
servations are too small to be consistent with the null hypothesis. Two- 
sided alternatives that are combinations of these two one-sided hypoth- 
eses are also of interest. 

(b) The observations are known to be independent and from continuous
populations with a common central median. The null hypothesis asserts 
that the populations are symmetrical about this common median. The 
alternative hypothesis is that not all of the populations are symmetrical 
about this common value with emphasis on lack of symmetry in the tails 
of the populations. 

Thus situation (a) is concerned with nonparametric rejection of outlying observations
while situation (b) considers symmetry in the tails of populations.
General statements of the tests derived for (a) and for (b) plus a discussion of
their properties have been given by Walsh [11, 12].

Only the most elementary tests for (a) and (b) will be mentioned here. Let 
$W(n, k)$ be the smallest integer such that


$$\Pr(x_{W(n,k)} < \phi) \le 1/2^k = \alpha_k.$$
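
A value of $W(n, k)$ can be computed from the binomial distribution. The sketch below assumes that the probability in the definition is taken under the null hypothesis of a common median $\phi$, so that the event $x_W < \phi$ means at least W of the n observations fall below $\phi$ and each does so with probability 1/2. The function name `smallest_w` is an illustrative choice.

```python
from fractions import Fraction
from math import comb


def smallest_w(n, k):
    """Smallest W with Pr(x_W < phi) <= 1/2**k when each of the n
    independent observations falls below phi with probability 1/2."""
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    alpha_k = Fraction(1, 2**k)
    for w in range(1, n + 1):
        # Pr(at least w observations lie below phi)
        tail = Fraction(sum(comb(n, i) for i in range(w, n + 1)), 2**n)
        if tail <= alpha_k:
            return w
    raise AssertionError("unreachable: the tail at w = n is 1/2**n")


print(smallest_w(10, 1), smallest_w(10, 3))  # 6 8
```

Exact fractions are used so that the comparison with $1/2^k$ is not affected by rounding.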


Situation (a) is considered first. The alternative that the k largest observations 
are too large to be consistent with the hypothesis of a common central median 
is accepted if 


$(x_{n+1-k} + x_1)/2 > x_{W(n,k)}$.


Subject to some weak restrictions on the populations from which the observations
were drawn, the significance level of this test tends to $\alpha_k$ as n increases.
For no value of n, however, does this significance level exceed $2\alpha_k$. Also the test
has favorable consistency properties with respect to deviations from the null 
hypothesis [11, 12]. The alternative that the k smallest observations are too 
small is accepted if 


$(x_n + x_k)/2 < x_{n+1-W(n,k)}$.


This second test has the same significance level and consistency properties as 
the test for whether the largest observations are too large. Finally, consider the 
two-sided test of whether the largest observations are too large or the smallest 
observations are too small. The two-sided alternative that either the k largest 
observations are too large or the k smallest observations are too small is ac- 
cepted if either 


$(x_{n+1-k} + x_1)/2 > x_{W(n,k)}$ or $(x_n + x_k)/2 < x_{n+1-W(n,k)}$.


The significance level of this test tends to $2\alpha_k$ as n increases and never exceeds
$4\alpha_k$. In applying these nonparametric tests for rejecting outlying observations,
it is important that k be chosen without knowledge of the values of the observa- 
tions. 

For situation (b), the alternative hypothesis that not all of the populations 
are symmetrical in their tails, about a common median value, is accepted if 
either 


$(x_{n+1-k} + x_1)/2 > x_{W(n,k)}$ or $(x_n + x_k)/2 < x_{n+1-W(n,k)}$.


This test has a significance level that tends to $2\alpha_k$ as n increases but this significance
level never exceeds $4\alpha_k$. The tests for situation (b) have the same
general consistency properties as the tests for situation (a). 
SIMPLIFIED BETA-APPROXIMATIONS TO THE
KRUSKAL-WALLIS H TEST


David L. Wallace
University of Chicago 


A Beta-approximation, commonly used to approximate permutation 
test distributions in the analysis of variance, is proposed for the null 
distribution of the Kruskal-Wallis H-statistic for one-way analysis of 
variance of ranks. The approximation seems slightly simpler and better 
than the Beta-approximation given by Kruskal and Wallis, particularly 
in bringing the H-test into closer relation to ordinary analysis of 
variance tests. Simple conditions on the group sizes allow further sub- 
stantial simplifications in the approximations. Numerical comparisons 
for very small samples illustrate the various approximations. 


1. THE KRUSKAL-WALLIS APPROXIMATION 


SUPPOSE the observations in a one-way analysis of variance table with C
samples of $n_1, \cdots, n_C$ observations, respectively, have been replaced by
ranks $1, 2, \cdots, N = \sum n_i$. Denote by $R_i$ the sum of ranks in the ith sample.
Kruskal and Wallis [1] introduce the statistic


$$H = \frac{12}{N(N+1)} \sum_{i=1}^{C} \frac{R_i^2}{n_i} - 3(N+1)$$

for testing the hypothesis that all samples come from the same population. 

Under the null hypothesis that the samples come from identical continuous 
populations, H has an approximate $\chi^2$ distribution with C−1 degrees of freedom,
provided the $n_i$ are not too small. Kruskal and Wallis give two other
approximating distributions, the best (and most complicated) being a Beta-approximation
to the distribution of $B_1 = H/M$, with M the maximum possible
value of H. The two parameters are chosen so that the means and variances 
agree. The approximation is generally more easily used in the form of an 
F-approximation. The statistic 


_ H[M -(C-1)] 
~ (C —1)(M — BH) 





has an approximate F-distribution with degrees of freedom fi, fz equal to twice 
the respective Beta parameters. In section 6.2 of the paper [1], M, fi, fe are 
given along with the Paulson approximation of the F-distribution by the normal 
distribution. This latter is most useful since the degrees of freedom are rarely 
integers. 


2. THE PROPOSED APPROXIMATIONS 


If one performs an ordinary, one-way analysis of variance on the ranks, de- 
noting the sums of squares between and within by SSB and SSW, respectively, 
$$SSB = \sum_{i=1}^{C} \frac{R_i^2}{n_i} - \frac{N(N+1)^2}{4}, \qquad SSB + SSW = \frac{N(N^2-1)}{12},$$

then the usual analysis of variance F and B statistics are

$$B_2 = \frac{SSB}{SSB + SSW} = \frac{H}{N-1},$$

$$F_2 = \frac{(N-C)\,SSB}{(C-1)\,SSW} = \frac{(N-C)H}{(C-1)(N-1-H)}.$$

Three Beta (F) approximations to the distribution of $B_2$ ($F_2$) will be proposed.
Although the normal theory Beta and F distributions, with degrees of freedom
C−1 and N−C, are known to hold fairly well for moderately nonnormal data
(indeed, the $\chi^2_{C-1}$ approximation to H is equivalent to the normal theory ap-
proximation in the F-form but with N taken to be infinity in the definition of
F and in the degrees of freedom), the approximation is improved by adjusting
the degrees of freedom to allow for the nonnormality of the data. The first
approximation proposed is the special case for ranks of an approximation ob-
tained in general by modifying the degrees of freedom of the fitted Beta distri-
bution so that its mean and variance agree with the exact mean and variance of
$B_2$. The factor d needed for this modification is given as equation (35) with
$k_4/k_2^2 = -1.2$ of Box and Andersen [2], who also give other results and
references.

I. Approximate the distributions of $B_2$ and $F_2$ respectively by Beta and F dis-
tributions with degrees of freedom $f_1^{I}$, $f_2^{I}$ (Beta parameters $f_1^{I}/2$, $f_2^{I}/2$) given by

$$f_1^{I} = (C-1)\,d_I, \qquad f_2^{I} = (N-C)\,d_I,$$

with the adjustment factor $d_I$ given by

$$d_I = 1 - \frac{6}{5}\cdot\frac{N+1}{N-1}\cdot\frac{1-Q}{N + 1.2(1-Q)}, \qquad Q = \frac{N(N+1)}{2(C-1)(N-C)}\left(\sum_{i=1}^{C}\frac{1}{n_i} - \frac{C^2}{N}\right). \tag{1}$$

Approximation I of $B_2$ is similar to the $B_1$ approximation of Kruskal and
Wallis [1]. $d_I$ can be written as

$$d_I = \frac{2[(N-C)(C-1) - V]}{(N-1)V}$$

in which V is the variance (under the null hypothesis) of H, given in equation
(6.2) of the paper [1]. (This latter form is easily obtained directly by fitting
moments and can be rewritten in the form (1) which shows more clearly the
dependence on the sample sizes.) In the latter form, the $B_1$ approximation is
seen to be exactly the same as the $B_2$-I approximation with N−1 replaced
throughout by M, the maximum possible value of H.

M was used so that the mean, variance and true maximum would agree. 
But this last adjustment causes some slight trouble. The most extreme ranking 
(the C samples completely separated) has the value $B_1 = 1$ or $F_1 = \infty$, and the
approximate significance level 0.¹ There seems to be a related tendency to
underestimate the significance level for extreme results. In very small samples,
results at or near the extreme can occur, under the null hypothesis, with sub-
stantial probability.

Limited numerical comparisons (see section 3) on small samples indicate that
the $B_2$ approximation is better in the extreme tails and about the same in
general. Thus, the $B_2$ approximation seems to have slight advantages in ac-
curacy, in simplicity through elimination of the quantity M, and in its relation
to the ordinary analysis of variance and the more general permutation theory
results.

The chief disadvantage of either the $B_1$ or the $B_2$-I approximation is the
computational difficulty of getting the degrees of freedom ($f_1$, $f_2$ or $f_1^{I}$, $f_2^{I}$).
Two choices of the adjustment factor d that eliminate this disadvantage and 
are sometimes appropriate are: 

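II. $$d_{II} = 1 - \frac{6}{5}\cdot\frac{N+1}{N-1}\cdot\frac{1}{N+1.2} \approx 1 - \frac{1.2}{N-1}$$

and $f_1^{II} = (C-1)\,d_{II}$, $f_2^{II} = (N-C)\,d_{II}$,

III. $d_{III} = 1$ and $f_1^{III} = C-1$, $f_2^{III} = N-C$.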





Approximation II is that obtained by calculating $d_I$ as if all sample sizes
were equal to the average sample size $\bar{n} = N/C$. It is exactly the same as I for
equal sample sizes and is quite close to I when the sample sizes do not vary 
too greatly. 
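
The relation between the adjustment factors can be checked numerically. The Python sketch below is only an illustration under stated assumptions: `variance_h` uses the null variance of H in the form usually quoted from equation (6.2) of [1], restated here from memory rather than from this paper, and all function names are hypothetical.

```python
def variance_h(sizes):
    """Null variance V of H (assumed form of equation (6.2) of [1])."""
    c, n = len(sizes), sum(sizes)
    return (2 * (c - 1)
            - 2 * (3 * c * c - 6 * c + n * (2 * c * c - 6 * c + 1)) / (5 * n * (n + 1))
            - 1.2 * sum(1 / ni for ni in sizes))


def d_one(sizes):
    """Adjustment factor for approximation I, from matching mean and variance."""
    c, n = len(sizes), sum(sizes)
    v = variance_h(sizes)
    return 2 * ((n - c) * (c - 1) - v) / ((n - 1) * v)


def d_two(n):
    """Adjustment factor for approximation II (all sizes treated as N/C)."""
    return 1 - 1.2 * (n + 1) / ((n - 1) * (n + 1.2))


def f2_statistic(h, sizes):
    """F-form of H; undefined when H equals N - 1 (B_2 = 1)."""
    c, n = len(sizes), sum(sizes)
    return (n - c) * h / ((c - 1) * (n - 1 - h))


sizes = [3, 4, 5]
d = d_one(sizes)
# Fractional degrees of freedom for approximation I.
f1, f2 = (len(sizes) - 1) * d, (sum(sizes) - len(sizes)) * d
```

With equal sample sizes `d_one` and `d_two` agree to rounding error; with sizes (3, 4, 5) they differ in the second decimal place.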

Approximation III is just the ordinary, normal theory analysis of variance 
test on the ranked data, and hence involves no adjustments to degrees of 
freedom and no interpolation to fractional degrees of freedom. However, for 
small samples, particularly with nearly equal sample sizes, this approximation 
introduces substantial errors. The following results bound the changes in sig-
nificance levels from using $B_2$ approximations II or III instead of I, and provide
a guide to using the simpler approximations.

For C>3, for significance levels between .20 and .005 and for ranges of sample 
sizes bounded by: 


min. n; 4 | 6 7 








17 26 39 


| | 
(a) max. n; | 8 | 15 20 
| 











or (b) max. nx | 12 





1 This behavior was pointed out to the author by K. A. Brownlee of the University of Chicago. 
the significance level $\alpha_I$ computed using I is always smaller (more extreme) than
the level $\alpha_{II}$ using II and the difference is bounded by:


$\alpha_{II}$                          .20     .159    .10     .05     [?]     .01     .005

(a) $\alpha_{II} - \alpha_I$           .0054   .0055   .0051   .0039   [?]     .0015   .0009

or (b) $\alpha_{II} - \alpha_I$        .0115   .0118   .0109   .0082   .0056   .0031   .0018


The upper bounds on $\alpha_{II} - \alpha_I$ apply also with the level $\alpha_{III}$ using III replacing
$\alpha_{II}$. $\alpha_{III} - \alpha_I$ can be negative. Bounds (a) apply to $\alpha_I - \alpha_{III}$ if the total sample
size N>50; bounds (b) apply if N>27.

These results were obtained by bounding the values of Q (Q=0 for II and 
Q=1 for III) and bounding the effect of the resulting change in d on the sig- 
nificance level. Bounds (a) correspond to a maximum change in d of .0278 and 
bounds (b) to .0594. The Paulson approximation to the F-distribution was used 
as sufficiently accurate for differences in level. The derivations are given in the 
report [3]. 

The $B_1$, $B_2$-I, and $B_2$-II approximations all involve fractional degrees of
freedom and hence require interpolation of some kind in Beta or F tables. For
most practical use, visual approximate interpolation will suffice. For increasing
levels of accuracy, one might use the Paulson approximation to F giving an
equivalent normal deviate

$$K_P = \frac{\left(1 - \dfrac{2}{9f_2}\right)F^{1/3} - \left(1 - \dfrac{2}{9f_1}\right)}{\left[\dfrac{2}{9f_2}F^{2/3} + \dfrac{2}{9f_1}\right]^{1/2}},$$

or one could use the deviate $K_P$ adjusted by its error computed at the nearest
exact F quantile, or an adjustment obtained by interpolating in the errors at
several near exact points.
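
Because the degrees of freedom are fractional, the Paulson deviate is convenient to evaluate by machine. The following Python sketch shows the unadjusted approximation only; it assumes the standard form of Paulson's normal approximation to F, and the function names are illustrative.

```python
import math


def paulson_deviate(f_value, f1, f2):
    """Equivalent normal deviate for an F value with (possibly fractional) d.f."""
    a1 = 2.0 / (9.0 * f1)
    a2 = 2.0 / (9.0 * f2)
    cube_root = f_value ** (1.0 / 3.0)
    return ((1 - a2) * cube_root - (1 - a1)) / math.sqrt(a2 * cube_root ** 2 + a1)


def upper_tail(deviate):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(deviate / math.sqrt(2.0))


# Approximate significance level of F = 2.866 with 4 and 20 degrees of freedom.
level = upper_tail(paulson_deviate(2.866, 4, 20))
```

The tabled upper 5 per cent point of F with 4 and 20 degrees of freedom is 2.866, so `level` should come out close to .05; the small remaining discrepancy is the kind of error that the adjustment described above is meant to remove.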





3. NUMERICAL COMPARISON OF THE APPROXIMATIONS 


The exact distribution of H is available for three samples of size up to five.²
In table 229, the levels using approximations $B_1$, $B_2$-I, $B_2$-II, and $B_2$-III
and the Kruskal-Wallis χ² approximation are compared with the exact proba-
bilities for the nearest attainable values of H above and below the levels of .20,
.10, .05, .01, .005, .001 for three samples of sizes (3, 3, 3), (3, 4, 5), (5, 5, 5).
The Paulson approximation, adjusted by its error at the nearest exact quantile
of F, was used in all cases except the $B_2$-III approximations. The $B_2$-III
values were obtained from the Beta tables. The sample sizes are much too
small for the $B_2$-III approximation to be seriously considered.





2 I am indebted to W. H. Kruskal and W. A. Wallis of the University of Chicago for furnishing these tables, 
as well as for helpful suggestions. 





TABLE 229 


COMPARISON OF THE $B_1$, $B_2$-I, $B_2$-II, $B_2$-III AND χ² APPROXIMATE SIG-
NIFICANCE LEVELS OF H WITH THE EXACT LEVEL. (ADJUSTED
PAULSON APPROXIMATION USED WITH $B_1$, $B_2$-I, $B_2$-II.)


Sample Sizes | Exact Probability $P(H \ge H_0)$ | Approximate Minus Exact Probability: $B_1$, $B_2$-I, $B_2$-II, $B_2$-III, χ²





-2321 ‘ -0171 -0279 | —.0388 
.1964 ‘ -0025 -0144 | —.0253 


. 1000 : .0113 .0247 -0008 
-0857 ‘ .0241 .0364 -0062 


.0500 ‘ .0136 .0230 -0108 
.0286 : .0043 .0044 -0296 


-0107 ‘ -0001 -0040 .0283 
.0036 . .0013 .0026 .0237 





2040 | +. 0004 0044 .0128 
.1992 0045 | +. .0004 | —.0086 


- 1035 : — .0039 . -0112 
.0991 . -0010 ‘ -0084 


.0008 
-0038 


-0507 ‘ .0050 : -0110 
.0490 ‘ -0041 ‘ .0102 


-0092 
-0102 


-0109 ‘ .0020 ; -0043 
-0097 ; .0014 ; -0035 


-0140 
-0145 


.0051 ‘ .0004 ‘ .0018 
-0050 ; -0005 


-0141 
.0140 


.0012 ‘ .0003 
-0009 , + .0002 


.0122 
.0114 
. 2009 ‘ -0057 ‘ .0163 
-1905 ‘ .0006 -00- -0096 


++ ++ ++ +4 +4 





-1015 ‘ -0018 
-0995 ‘ -0001 


-0040 
.0029 


-0509 : .0008 
.0488 ‘ -0026 


.0082 
-0068 


.0105 ; -0023 
.0094 : -0013 


.0080 
-0089 


.0052 -0020 -0012 
-0050 .002 -0013 


-0076 
.0074 


.0012 P -0001 
.0010 ‘ -G001 


-0063 
-0060 























A PRODUCTION MODEL AND CONTINUOUS SAMPLING PLAN


I. RICHARD SAVAGE*
University of Minnesota 


A production model is considered where the quality of output de- 
creases until corrective action is taken and then the cycle is repeated. 
For this model, the sampling plan of examining each Fth item and tak- 
ing corrective action whenever a defective is found, is evaluated. Illus- 
trations are presented of the production model and choice of sampling 
plan, F. The relationship of this model and plan to earlier ones is dis-
cussed. Specifically, the probability of producing a good item after t
units of production is assumed to be $PR^{z(t)}$, where z(t) is the value of a
Poisson process and represents the change (from good to bad) in the
production process since “trouble-shooting.” P and R are “quality”
parameters satisfying 0 < P ≤ 1 and 0 ≤ R < 1. Costs, such as that of
looking for and removing trouble, and of inspection, are introduced. 
The average income is maximized by the choice of F. 


0. INTRODUCTION AND SUMMARY 


THE continuous sampling plan which calls for inspecting each Fth item pro-
duced and stopping production on locating a defective item, is evaluated
for a model of production that allows the quality of output to vary with time.¹
The approach differs from most work in this field in the following respects: 


(1) The continuous sampling plan incorporates an explicit rule for stopping
production and “trouble-shooting”, i.e., looking for trouble in the process
and removing any if found.

(2) The plan is assessed not through its AOQ function but through income 
when using the plan. 

(3) The production process is not always at the same quality level. Instead 
it is assumed that quality will change in a specified manner unless action 
is taken. 


The nature of the plan can be indicated by a simple example. Consider the 
following situation: Items are being produced one after another and we apply 
the Sampling Plan: Every Fth item produced is inspected. If the item is found 
good, production is continued, and if it is found bad, production is stopped and 
trouble in the process is removed if present. At each stop the time required for 
“trouble-shooting”—seeking the trouble and repairing it if there is any—is A 
units and the cost of trouble-shooting is B units. 

Production Process States: The production process states incorporate the mech-
anism effecting the decrease in output quality as the time from trouble-shooting
increases. The state of the production process, t units of time after trouble-
shooting, is given by z(t), which is a chance quantity and is assumed to have a
Poisson distribution with parameter λt. The larger the values of z(t), the greater
the probability of bad items.

* This work was done at Stanford University, sponsored in part by the Office of Naval Research. It constitutes
a revision of reference [23].

1 Sampling plans which call for stopping production at the first sign of trouble are like most other plans [3, 4, 5,
6, 7, 14, 22] that have been suggested, in that the detection of each defective causes a modification in the sampling
for the immediate future. The action is seldom so severe (see Duncan's discussion [7] of control charts where the
action is as severe), usually consisting of more frequent inspections.

Quality Process: If the production process state is z(t), then the probability
of producing a good item at t is $R^{z(t)}$. The constant R measures the decay rate
of the quality. It is assumed that 0 ≤ R < 1. Thus, soon after trouble-shooting
the items produced have a high probability of being good and as time goes on
this probability decreases.

Costs: The value of a good item is one unit and the value of a bad item is 
zero. The inspection is destructive and the cost of inspecting an item is c units.

Under the above assumptions the average income per unit of time, L, is
given by the formula:

$$L = \frac{\dfrac{e^{\lambda F(1-R)} - e^{\lambda(1-R)}}{e^{\lambda(1-R)} - 1}\,[E(i^*) - 1] - B - cE(i^*)}{F\,E(i^*) + A},$$

where E(i*) is the expected number of items inspected after trouble-shooting
until a defective is found, and:

$$E(i^*) = \sum_{j=0}^{\infty} e^{-j\lambda F}\, e^{\lambda F R(1-R^j)/(1-R)}.$$

The average outgoing quality, AOQ, is found by computing 1−L with
A=B=c=0.

The two parameters that must be estimated before choosing a plan, i.e., a
value of F, are λ and R. When λ and R have been specified, select F to maximize
L. If λ and R cannot be found exactly we can explore the consequences of using
various values for F by plotting L as a function of F for several combinations 
of the parameters. 
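
The search for the best F described above is easy to carry out numerically. The Python sketch below is an illustration only: it assumes the simple model of this section (P = 1, destructive inspection, every Fth item inspected), writes the Poisson parameter as `lam`, and takes the income formula in the form stated in the code comments. The function names and the parameter values in the example are hypothetical.

```python
import math


def expected_inspections(lam, r, f, tol=1e-12):
    """E(i*): sum over j >= 0 of exp(-j*lam*f + lam*f*r*(1 - r**j)/(1 - r))."""
    total, j = 0.0, 0
    while True:
        term = math.exp(-j * lam * f + lam * f * r * (1 - r ** j) / (1 - r))
        total += term
        if term < tol:  # terms decrease monotonically, so stop once negligible
            return total
        j += 1


def income_rate(lam, r, f, a, b, c):
    """Average income per unit time L; requires lam > 0 and 0 <= r < 1.

    Assumed form: L = {[(g**f - g)/(g - 1)] * (E(i*) - 1) - b - c*E(i*)} / (f*E(i*) + a)
    with g = exp(lam*(1 - r)), a = trouble-shooting time, b = its cost,
    c = cost of inspecting one item.
    """
    e_i = expected_inspections(lam, r, f)
    g = math.exp(lam * (1 - r))
    good_uninspected = (g ** f - g) / (g - 1) * (e_i - 1)
    return (good_uninspected - b - c * e_i) / (f * e_i + a)


def best_sampling_interval(lam, r, a, b, c, max_f=200):
    """Integer F in 1..max_f that maximizes L (a plain grid search)."""
    return max(range(1, max_f + 1), key=lambda f: income_rate(lam, r, f, a, b, c))


# Hypothetical parameter values, chosen only to exercise the functions.
best_f = best_sampling_interval(lam=0.01, r=0.5, a=2.0, b=5.0, c=0.2)
```

Evaluating `income_rate` over a range of F for several (λ, R) pairs gives the curves of L against F suggested in the text.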

In the course of this paper the preceding assumptions are stated in a more 
general form. It is shown that the assumptions for the production process and 
the quality process have many of the properties needed to describe these 
phenomena. Although the plans considered are simple they have much in 
common with plans now in use. 

The notation used is extensive. It is introduced in those sections indicated 
by an asterisk in the following summary. Section 1, Preliminary Remarks, 
provides an orientation between the proposed and earlier plans. In Section 2*, 
Stochastic Processes, the production model and the sampling plan are formulated 
as stochastic processes. In Section 3*, Fundamental Quantities, important ex- 
pected values related to these stochastic processes are defined and formulas for 
their evaluation are presented. The derivations are in Section 7. In Section 4*, 
Income, the expected values are used for finding the costs and incomes from 
different sampling plans when different models are assumed. Section 5, Be- 
ginning a Plan, gives the steps required before using a plan. In Section 6, 
Examples, the appropriate models and sampling plan are found for two pro- 
duction situations, and the numerical analysis to find the best plan is done. 
1. PRELIMINARY REMARKS


The number of proposed continuous sampling plans is large (see particularly 
the survey by Bowker [2]). This makes it possible for the quality control 
engineer to select a plan that will have many of the properties desired in a 
particular situation. 

There are several reasons for using continuous sampling plans. All plans that 
have been proposed insure a desired AOQL, Average Outgoing Quality Limit. 
In part, this results from screening—removal of bad inspected items. While 
using the plans the inspector is kept informed on the state of the production 
process. This helps by indicating when to search for trouble either by examining 
the process in operation, or, in extreme cases, to shut down the process and 
remove sources of trouble. Most of the theoretical work that has been done on 
continuous sampling plans has been restricted to the case where the production 
process does not change over time. Also, most of the plans have not been 
formulated with a clear rule as to when trouble-shooting should be done.¹

Lieberman [10] has considered the results of lack of control on the operating
characteristics of a plan proposed by Dodge [4].² The production process that
he considers is the one that leads to the poorest quality, i.e., gives the largest
AOQL. His results are concerned only with the screening effects of the plan
and do not touch on the trouble-shooting aspects.

Girshick and Rubin [9] have considered a particular type of production
process³ and formulated a plan that would give the best possible operating
characteristics among all possible plans. Their plan involves screening. Further, 
their plan has a trouble-shooting rule, so that the process will turn out better 
material when the plan is used than when it is not used. A difficulty with the 
Girshick-Rubin plan is that it is very hard to arrive at the numerical results 
required to put it into operation. 

Savage [22] has discussed a plan that explicitly takes into account the 
trouble-shooting nature of continuous sampling plans. This plan has been 
evaluated only when the production process is in statistical control. Rosenblatt 
and Weingarten [20] have modified the original Dodge plan so that the produc- 
tion process is stopped if more than a specified number of defectives are found 
in a given time unit, e.g., a day. 

The stopping rule in the Girshick-Rubin plan is based on a critical value for 
a likelihood ratio. In the Savage plan [22] the stopping rule is formed as an 
analogue to sequential sampling rules. In the Rosenblatt-Weingarten plan the 
stopping rule is based on the accumulation of evidence over the relatively 
recent past. In this paper, the stopping rule involves evidence from the im- 
mediate past. Studies have not been made comparing these stopping rules. 

Gregory is working on modifications of proposed plans, e.g., Dodge’s [4], 
having the feature that at some point in the cycle of the inspection plan there 





2 The Dodge Plan (i, F) [4] begins with 100% inspection until i consecutive good items are found and then
begins sampling at the rate 1/F, reverting to 100% inspection whenever a bad item is found. Bad items are replaced
with good items. Lieberman’s result is AOQL < (F − 1)/(F + i).

3 In the Girshick and Rubin model [9] the production process states have a geometric distribution. Specifically,
production begins in the good state, and while there before each item is produced a chance event occurs so that the
probability of remaining in the good state is 1 − g and the probability of going to the bad state is g. Once in the bad
state the process remains there until trouble is removed. This model was used by Duncan [7], and Gregory [11, 12].
is an order for the process to be stopped, examined, and brought into the de-
sired state of statistical control [11, 12]. Gregory’s study involves a simple pro-
duction process,³ but on the other hand it is possible to analyze fairly compli-
cated continuous sampling plans under this model.


2. STOCHASTIC PROCESSES 


There are three stochastic processes required for the description of the pro- 
duction and inspection gestalt. 

I. The inspection process (sampling plan) directs the inspector at each time. 
The states in which this process can be are: 


A. Produce an item and do not inspect it. 
B. Produce an item and inspect it. 
C. Trouble-shooting, no production. 


II. The production process yields the history of the decay of quality between
trouble-shootings. At each instant of time the state of the production process
is a number z(t). Larger values of z(t) are indicative of poorer quality. It is
assumed that z(t) is a non-decreasing function of t unless the inspection process
is in state (I.C.), i.e., the quality can improve only by trouble-shooting.

III. Associated with the quality process is the random variable x(t) equal
to one if the item produced at t is good and equal to zero if the item produced at
t is bad. It is assumed that the probability distribution of x(t) at t depends only
on z(t). Further, given the function z(t), the random variables $x(t_1), x(t_2), \ldots, x(t_n)$
are mutually independent.

In the above it is assumed that production occurs at regularly spaced time 
intervals. For many applications this is a realistic assumption. But for theoreti- 
cal purposes, and because it will fit many other situations, production in a con- 
tinuous stream of goods, such as a wire or linoleum, will also be considered. 

Now consider a restricted class of sampling plans. In these plans stop produc-
tion at the first sign of trouble, i.e., stop when a defective item is inspected.
To describe this process in detail introduce the technical term cycle. By a cycle 
is meant the length of time, and all that happens during that time, between 
the starting of the production until the production (1) has been stopped by 
the sampling plan, and (2) trouble-shooting is completed. The stochastic proc- 
esses are described in terms of cycles. 

Assumption: The time between the beginning of production and the time the
first item is inspected is a positive random variable, $I_1$. If the first item in-
spected is bad, stop production. The time for trouble-shooting is a function of
$z(I_1)$ only. If the first item inspected is good the time until the next inspection
is a random variable, $I_2$, independent of $I_1$ but with the same distribution. If
the second item inspected is bad the process is stopped and the time for trouble-
shooting is a function of $z(I_1 + I_2)$. If the second item inspected is good the in-
spection process is continued in the obvious manner. Use $I_1, I_2, \ldots$ to denote
the time between inspections, and $S_i = \sum_{j=1}^{i} I_j$. The basic assumption is that
the $I_j$'s are mutually independent and identically distributed. Further, the
time for trouble-shooting is a function only of z(S) where S is the time at which
the process is stopped. The number of random variables, $I_j$, in a cycle is a
random variable.


Now consider several sampling plans or alternatively several distributions
of $I_1$.


Sampling Plan I. ($SP_I$)

$$\Pr(I_1 = F) = 1, \qquad F > 0$$

When items are produced at regularly spaced intervals, choose F as a positive
integer.⁴ Then, F=1 corresponds to 100% inspection.

Sampling Plan II. ($SP_{II}$)

In discrete production let $\Pr(I_1 = k) = (F-1)^{k-1}/F^k$ where F>1. In con-
tinuous stream production let $\Pr(I_1 < t) = 1 - e^{-t/F}$, where F>0. In both cases,
the expected time between inspections is F.

Sampling Plan General ($SP_G$)

$$\Pr(I_1 < t) = D(t)$$

where D(t) is a cumulative distribution function which satisfies D(0−) = 0
and $\int_0^\infty t\,dD(t) = F < \infty$.

It is assumed that inspection results are obtained instantaneously and are
used in making the decision to produce the next item or stop production. It
is undesirable to inspect at regular intervals, $SP_I$, in situations where unscru-
pulous workers could make sure the work to be inspected would be good. Under
$SP_{II}$ the amount of production between inspections is an unbounded random
variable but the expected amount between inspections is F.

The function describing the trouble-shooting time when production is
stopped in state z(S) is not critical. Assume it has the form $A_1 + A_2 z(S)$ where
$A_1$ and $A_2$ are non-negative constants. The formula arises from the fixed time
($A_1$) to stop production plus a time proportional ($A_2$) to the wear [z(S)] for
repairing.

Assumption: If production were allowed to go on uninterrupted, the process
z(t) would have independent, non-negative stationary increments. Specifically,
if $t_1, t_2, \ldots, t_n$ is an increasing sequence of times, then the random variables
$z(t_i) - z(t_{i-1})$ for $i = 1, 2, \ldots, n$ are mutually independent, assume only non-
negative values and the distribution of $z(t_i) - z(t_{i-1})$ depends on $t_i$ and $t_{i-1}$ only
through $t_i - t_{i-1}$. As a convention choose $t_0 = 0$ and $z(t_0) = 0$.

Production Process States I ($PPS_I$)

The state of the production process is a Poisson stochastic process with
parameter λ, i.e.,

$$\Pr[z(t_i) - z(t_{i-1}) = k] = \exp[-\lambda(t_i - t_{i-1})]\,[\lambda(t_i - t_{i-1})]^k / k!$$

The values of the differences of the z’s in disjoint time intervals are independ-
ently distributed.





4 Some of the notation is unconventional in quality control literature. I have chosen to score a good item 1 
instead of 0 since it fits in better with the discussion of income. Also, P here represents the probability of an item 
being good, rather than bad, as is customary, I have chosen to use F, instead of the conventional f, for the sampling 
rate. 
The choice of stochastic process for the production process states is more
flexible and more realistic than the two state model of Girshick and Rubin.³
In the process considered here there is an infinite number of states and the proc- 
ess keeps deteriorating as time goes on. The process does not make abrupt 
changes from good to bad but instead change can be relatively slow. (The 
process under study and the Girshick-Rubin process coincide when the bad 
state of the latter consists of only producing bad items.) Non-negative incre-
ments mean that the process goes in only one direction, i.e., from good to bad.
The process cannot improve itself. Of course, if there is no corrective action, 
eventually the process will break down and fail to produce. 

Let $PPS_G$ denote “Production Process States General,” i.e., any production
process with independent, non-negative, stationary increments. Only $PPS_I$ is
considered in detail but the introduction of $PPS_G$ allows symmetric presenta-
tion of results (Section 3).


Quality Process 


For a fixed realization of z(t) the qualities (good or bad) of items produced
at different times are mutually independent and $\Pr[x(t)=1] = PR^{z(t)} = 1 - \Pr[x(t)=0]$,
where x(t)=1 when the item produced at t is good and x(t)=0
when the item produced at t is bad. P is the probability of producing a good
item when the production process is at its best, i.e., just after production has
begun. (In most situations the parameter P will be close to 1.) Since $R^{z(t)}$ is a
decreasing function of z(t), quality decreases as time increases. The function
$PR^{z(t)}$ has the desirable properties: (1) at the beginning of time it is at its
maximum, (2) as time goes on it approaches zero, (3) if R is near one, the
probability of a good item changes slowly as a function of z(t) or of t.


3. FUNDAMENTAL QUANTITIES 


This section contains expected values required in evaluating the plan. The 
derivations are in Section 7. 

Define the random variable $S_i = \sum_{j=1}^{i} I_j$. $S_i$ is the elapsed time until the ith
inspection. A fundamental random variable is S, the time the first defective is
found. A general formula for $\Pr(S > S_i)$ and then special cases of this formula
are presented. Nearly all the relevant expected values are found in terms of 
these probabilities. Thus, only general results for the other expected values are 
given since special cases can be evaluated by substitution. 

Definition: S is the time at which the first bad item is inspected. 

I. $PPS_G$ and $SP_G$

$$\Pr(S > S_i) = P^i \prod_{j=0}^{i} E\left[R^{\,j\,z(I_1)}\right] \quad \text{for } i = 0, 1, \ldots$$

II. $PPS_I$ and $SP_G$

$$\Pr(S > S_i) = P^i \prod_{j=0}^{i} E\left[e^{-\lambda I_1(1-R^j)}\right] \quad \text{for } i = 0, 1, \ldots$$

III. $PPS_I$ and $SP_I$

$$\Pr(S > S_0) = 1, \qquad \Pr(S > S_i) = (Pe^{-\lambda F})^i\, e^{\lambda F R(1-R^i)/(1-R)} \quad \text{for } i \ge 1$$

IV. $PPS_I$ and $SP_{II}$ (discrete)

$$\Pr(S > S_0) = 1, \qquad \Pr(S > S_i) = \frac{(Pe^{-\lambda})^i\, e^{\lambda R(1-R^i)/(1-R)}}{\prod_{j=1}^{i}\left[F - e^{-\lambda(1-R^j)}(F-1)\right]} \quad \text{for } i \ge 1$$

V. $PPS_I$ and $SP_{II}$ (continuous)

$$\Pr(S > S_i) = P^i \prod_{j=0}^{i} \left[1 + \lambda F(1-R^j)\right]^{-1} \quad \text{for } i = 0, 1, \ldots$$

VI. $PPS_G$ and $SP_G$

$$E(S) = F \sum_{i=0}^{\infty} \Pr(S > S_i)$$

Definition: i* is the number of items inspected in a cycle.

VII. $PPS_G$ and $SP_G$

$$E(i^*) = \sum_{i=0}^{\infty} \Pr(S > S_i) = E(S)/F$$

Definition: x(a) is the quality of the item produced at time a. If the item is
good, then x(a) = 1 and if the item is bad, then x(a) = 0.

VIII. $PPS_G$ and $SP_G$

$$E[x(a)] = E[PR^{z(a)}]$$

IX. $PPS_I$ and $SP_G$

$$E[x(a)] = P\,E\left[e^{-\lambda a(1-R)}\right]$$

Definition: $X_0$ is the amount of good production before the first inspection.

X. $PPS_G$ and $SP_G$ (discrete)

$$E(X_0) = P\,E\left[\sum_{a=1}^{I_1-1} R^{z(a)}\right]$$

XI. $PPS_G$ and $SP_G$ (continuous)

$$E(X_0) = P\,E\left[\int_0^{I_1} R^{z(a)}\,da\right]$$

XII. $PPS_I$ and $SP_G$ (discrete)

$$E(X_0) = \frac{P - e^{\lambda(1-R)}\Pr(S > S_1)}{e^{\lambda(1-R)} - 1}$$

XIII. $PPS_I$ and $SP_G$ (continuous)

$$E(X_0) = \frac{P - \Pr(S > S_1)}{\lambda(1-R)}$$

Definition: X* is the amount of good uninspected production before the
process is stopped, i.e., before S.

XIV. $PPS_G$ and $SP_G$

$$E(X^*) = E(X_0)\,\frac{E(i^*) - 1}{\Pr(S > S_1)}$$

Definition: Z* is the state of the production process at the time that it is
stopped, i.e., Z* = z(S).

XV. $PPS_G$ and $SP_G$

$$E(Z^*) = \lambda E(S)$$

(In general $\lambda = E[z(t_2) - z(t_1)]/(t_2 - t_1)$.)
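
The closed forms for the Poisson production process with every Fth item inspected (discrete production) can be compared with a direct simulation of cycles. The Python sketch below is an illustration under stated assumptions: the Poisson parameter is written `lam`, formulas III, VI, VII, XII, XIV and XV are taken in the forms given in the docstrings and comments, and every function name and parameter value is hypothetical.

```python
import math
import random


def survival(i, lam, p, r, f):
    """Pr(S > S_i): (p*exp(-lam*f))**i * exp(lam*f*r*(1 - r**i)/(1 - r))."""
    return (p * math.exp(-lam * f)) ** i * math.exp(lam * f * r * (1 - r ** i) / (1 - r))


def cycle_expectations(lam, p, r, f, tol=1e-12):
    """Return E(i*), E(S), E(X*), E(Z*) for one cycle."""
    e_i, i = 0.0, 0
    while True:                      # E(i*) = sum of Pr(S > S_i), i = 0, 1, ...
        term = survival(i, lam, p, r, f)
        e_i += term
        if term < tol:
            break
        i += 1
    e_s = f * e_i                    # E(S) = F * E(i*)
    g = math.exp(lam * (1 - r))
    s1 = survival(1, lam, p, r, f)
    e_x0 = (p - g * s1) / (g - 1)    # good production before the first inspection
    e_x = e_x0 * (e_i - 1) / s1      # good uninspected production per cycle
    e_z = lam * e_s                  # expected state when production stops
    return e_i, e_s, e_x, e_z


def poisson_draw(rng, mean):
    """Poisson variate: count unit-rate exponential arrivals in [0, mean)."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_cycle(rng, lam, p, r, f):
    """One cycle; returns (items inspected, good uninspected items)."""
    z, inspected, good = 0, 0, 0
    while True:
        for _ in range(f - 1):                 # items that are not inspected
            z += poisson_draw(rng, lam)
            good += rng.random() < p * r ** z
        z += poisson_draw(rng, lam)            # the F-th item is inspected
        inspected += 1
        if rng.random() >= p * r ** z:         # defective found: stop the cycle
            return inspected, good


rng = random.Random(12345)
lam, p, r, f = 0.05, 0.98, 0.6, 4
cycles = [simulate_cycle(rng, lam, p, r, f) for _ in range(20000)]
mean_inspected = sum(c[0] for c in cycles) / len(cycles)
mean_good = sum(c[1] for c in cycles) / len(cycles)
e_i, e_s, e_x, e_z = cycle_expectations(lam, p, r, f)
```

With a few thousand simulated cycles the sample means of the number inspected and of the good uninspected production should agree with `e_i` and `e_x` to within sampling error.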


4. INCOME 


A sampling plan can be evaluated in terms of the income when the plan is 
used. There are many ways of defining income. We consider the average income 
per unit of time 


[Income up to time T]/T

This is a random variable. We center our interest on

$$\lim_{T \to \infty}\ [\text{Income up to time } T]/T.$$

Girshick and Rubin [9] have shown for models like the ones under considera- 
tion that this limit exists with probability 1 and the limit is given by 


L = E [Income per cycle]/E[Length of a cycle]. 


In short lengths of production the average income per unit of time may be 
quite different from L. However, we are thinking of using sampling plans in 
situations where the total production will be large and hence L is indicative of 
the adequacy of a plan. 

To compute L first define several economic parameters. Assume an un- 
inspected good item is worth 1 unit and a bad item is worth 0 units. (In any 
cost problem it is possible to choose two of the costs arbitrarily and then de-
termine the other costs in terms of these.) Then, the value of production at t
is x(t). Let $c_0$ ($c_1$) be the value of a bad (good) inspected item less the inspection
cost. With destructive inspection $c_0$ ($c_1$) is the negative of the cost of inspecting
a bad (good) item. In non-destructive testing, sometimes $c_1 > 1$ when an item
known to be good can be sold at a premium. Assume the time for trouble-
shooting in state z* is $A_1 + A_2 z^*$ and the cost is $B_1 + B_2 z^*$. Then:

XVII.

$$L = \frac{E(X^*) - B_1 - B_2E(Z^*) + (c_0 - c_1) + c_1E(i^*)}{E(S) + A_1 + A_2E(Z^*)}$$

XVIII. Proportion of Time Devoted to Production

$$= \frac{E(S)}{E(S) + A_1 + A_2E(Z^*)} = PTDP$$

XIX. Proportion of good items produced (discrete case)

$$= \frac{E(X^*) + E(i^*) - 1}{E(S)} = 1 - AOQ.$$

XX. Proportion of production time of good quality (continuous case)

$$= E(X^*)/E(S).$$

XXI. Proportion of items produced that are inspected

$$= E(i^*)/E(S) = \frac{1}{F} = AFI.$$

(Average Fraction Inspected = AFI)

Formulas XVIII through XXI are special cases of XVII.
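
The way the proportions arise as special cases of the income formula can be seen by evaluating it with particular economic parameters. The Python sketch below simply transcribes formulas XVII, XVIII and XXI as read here; the function names and the numerical inputs are hypothetical and serve only to exercise the arithmetic.

```python
def income_rate(e_x, e_i, e_s, e_z, a1, a2, b1, b2, c0, c1):
    """Formula XVII: expected income per cycle over expected cycle length."""
    income_per_cycle = e_x - b1 - b2 * e_z + (c0 - c1) + c1 * e_i
    cycle_length = e_s + a1 + a2 * e_z
    return income_per_cycle / cycle_length


def proportion_time_producing(e_s, e_z, a1, a2):
    """Formula XVIII: share of a cycle spent producing (PTDP)."""
    return e_s / (e_s + a1 + a2 * e_z)


def average_fraction_inspected(e_i, e_s):
    """Formula XXI: inspected items per item produced (AFI)."""
    return e_i / e_s


# Hypothetical expected values for one cycle.
e_x, e_i, e_s, e_z = 16.0, 6.0, 24.0, 1.2

# No trouble-shooting time or cost; inspected items valued at 0 (bad) and 1 (good):
# the income formula reduces to the proportion of good items, formula XIX.
share_good = income_rate(e_x, e_i, e_s, e_z, 0, 0, 0, 0, c0=0, c1=1)

# All costs and inspected-item values zero: formula XX, E(X*)/E(S).
share_good_uninspected = income_rate(e_x, e_i, e_s, e_z, 0, 0, 0, 0, c0=0, c1=0)
```

In practice the four expected values would come from the formulas of Section 3 for the chosen production model and sampling plan.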


5. BEGINNING A PLAN 


Before using the plans described in this paper one should make sure that the 
production and quality process are, at least approximately, those described in 
Section 2. If these conditions are satisfied the following must be done: 


1. Select or determine
a. The parameter, λ, for the production process (sect. 2),
b. The parameters, P and R, for the quality process (sect. 2),
c. The method of sampling and the sampling rate, F (sect. 2), and
d. The economic parameters (sect. 4)
i. The parameters for trouble-shooting time, $A_1$ and $A_2$,
ii. The parameters for trouble-shooting cost, $B_1$ and $B_2$, and
iii. The value of a bad (good) item less inspection costs, $c_0$ ($c_1$).
2. If an inspected item is bad the process is stopped and trouble-shooting is 
done; if an inspected item is good the inspection process is continued. 
3. All uninspected production is sold. 


Of course as time goes on new values may be found for the parameters in 1 
(above) and those should then be used. 


6. EXAMPLES 


Two examples are considered in detail to illustrate the preceding formulas 
and to emphasize that although the formulas are not simple enough to allow 
the computation of tables they are in such form that particular problems can 
be analyzed. Only the best plan among those considered is found. This probably 
is not the best possible plan. 

The following idealized example affords an illustration of all of the theory. 
In this example, for a priori reasons, the conditions of the model are satisfied. 
The production process is one of counting. The census bureau has recorded for
each of the hundred and sixty million individuals in the country data on a large
sheet (one for each individual). The sheet contains 1,000 positions and each
position is either filled or blank. The probability that a position is filled is 1/2 for
each individual in each position. The count desired is the number of positions
filled in a sheet. A scoring screen is placed over each sheet which makes contact
with each position, and records the total number of filled contacts. Unfortunately,
the screen, after it has been used, makes mistakes due to the contact
points breaking. The machine makes a mistake if a filled position occurs under
a broken contact point.

The number of broken contact points can be represented by a Poisson
stochastic process corresponding to z(t). This is a good approximation when:
(1) the failure times for the individual points are independently distributed,
(2) only a few failures occur. Let $\lambda$, the number of failures per unit of time (the
time to process a sheet), be given by .0001. With a new screen there will be no
failures, P=1. Finally, R=1/2, since the probability of no mistakes when there
are k failed points is the probability that in the sheet at hand those k positions
are blank.

The inspection procedure consists in examining a sheet to see if the count of
filled positions agrees with that of the machine. Assume the cost of hand inspection
is 11 units, the value of a properly counted sheet is one, and of an
incorrectly counted sheet is zero. Assume the inspection process is instantaneous.
Trouble-shooting consists of removing the screen and replacing it with a
new one. The time required for this operation is 25 units and the cost is 50
units. Time and costs of trouble-shooting do not depend on z(S).

First use sampling at rate F with equally spaced time intervals, i.e., SPr.

$$E(i^*) = \sum_{i=0}^{\infty} \left[ (e^{-.0001F})^{i} \, e^{.0001F(1 - 2^{-i})} \right]$$

$$\Pr(S > S_1) = e^{-.00005F}$$

$$E(X_0) = \frac{1 - e^{-.00005(F-1)}}{e^{.00005} - 1}$$

$$L = \frac{\dfrac{E(X_0)[E(i^*) - 1]}{\Pr(S > S_1)} - 50 - 1 - 10E(i^*)}{FE(i^*) + 25}$$

$$\text{PTDP} = \frac{1}{1 + 25/FE(i^*)} \quad \text{(proportion of time devoted to production)}$$

$$\text{AOQ} = 1 - \frac{\dfrac{E(X_0)[E(i^*) - 1]}{\Pr(S > S_1)} + E(i^*) - 1}{FE(i^*)}$$
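The expressions for the equally spaced case can be evaluated numerically. The following is a minimal sketch, assuming the readings $\lambda = .0001$, $P = 1$, $R = 1/2$ and the cost figures of this example (trouble-shooting time 25 and cost 50, $c_0 = -11$, $c_1 = -10$). The function names, the dictionary keys and the series cut-off `tol` are illustrative choices and do not come from the paper.

```python
import math

LAM = 0.0001   # failures per sheet processed (lambda in the example)
R = 0.5        # chance that one broken contact causes no mistake on a sheet


def pr_pass_first_i(i, F):
    """Pr(S > S_i) for equally spaced inspections with P = 1 and R = 1/2."""
    # 2**(-i) is negligible long before i = 1000; the clamp only avoids tiny floats.
    return math.exp(-LAM * F * i + LAM * F * (1.0 - 2.0 ** (-min(i, 1000))))


def fixed_spacing_measures(F, tol=1e-12):
    """Evaluate E(i*), E(S), E(X*), L, PTDP and AOQ for sampling interval F."""
    e_i, i = 0.0, 0
    while True:                      # E(i*) = sum over i >= 0 of Pr(S > S_i)
        term = pr_pass_first_i(i, F)
        e_i += term
        if term < tol:
            break
        i += 1
    c = LAM * (1.0 - R)              # .00005 in the example
    pr1 = math.exp(-c * F)
    e_x0 = -math.expm1(-c * (F - 1)) / math.expm1(c)
    e_s = F * e_i
    e_x = e_x0 * (e_i - 1.0) / pr1
    return {
        "E_i": e_i,
        "Pr1": pr1,
        "E_S": e_s,
        "E_X": e_x,
        "L": (e_x - 50.0 - 1.0 - 10.0 * e_i) / (e_s + 25.0),
        "PTDP": 1.0 / (1.0 + 25.0 / e_s),
        "AOQ": 1.0 - (e_x + e_i - 1.0) / e_s,
    }


if __name__ == "__main__":
    for F in (100, 400, 410, 1000):
        print(F, fixed_spacing_measures(F))
```

Scanning a grid of F values with such a function is one way to see how flat L is near its maximum.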





A good value for F is 410, where L=.94706. The L function is very flat in
the neighborhood of the maximum. If for administrative reasons it was more
convenient to use F=400, there would be no appreciable loss and the AOQ
would be slightly better. The AOQ can be made as small as desired by picking
F small, but this is costly. The expected amount of production before stopping
using F = 410 is 10,600, which is slightly more than 10,000, the expected waiting
time for the first contact point to break. The proportion of time devoted to
production is insensitive to the F-value.

TABLE 241a

    F       E(i*)     Pr(S>S_1)   E(S)            E(X*)           PTDP
    100     101.491   .99501      1.0149 x 10^4   .99736 x 10^4   .99754
    380     27.783    .98118      1.0557 x 10^4   1.0248 x 10^4   .99764
    390     27.107    .98069      1.0572 x 10^4   1.0255 x 10^4   .99764
    400     26.465    .98020      1.0586 x 10^4   1.0263 x 10^4   .99764
    410     25.855    .97971      1.0600 x 10^4   1.0270 x 10^4   .99765
    420     25.273    .97922      1.0615 x 10^4   1.0278 x 10^4   .99765
    500     21.457    .97531      1.0728 x 10^4   1.0337 x 10^4   .99768
    1000    11.419    .95123      1.1419 x 10^4   1.0673 x 10^4   .99782
    5000    3.2168    .77880      1.6084 x 10^4   1.2590 x 10^4   .99845
    10000   2.0875    .60653      2.0875 x 10^4   1.4108 x 10^4   .99880

Max L = .947 at F = 410.

Since the production between inspections is large, it is reasonable to think 
of this as the continuous case and use SPy (continuous). Then: 


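$$E(i^*) = \sum_{i=0}^{\infty} \prod_{j=0}^{i} [1 + .0001F(1 - 2^{-j})]^{-1}$$

$$\Pr(S > S_1) = [1 + .00005F]^{-1}$$

$$E(X_0) = \frac{1 - [1 + .00005F]^{-1}}{.00005}$$

$$L = \frac{\dfrac{E(X_0)[E(i^*) - 1]}{\Pr(S > S_1)} - 10E(i^*) - 50}{FE(i^*) + 25}$$

$$\text{PTDP} = [1 + 25/FE(i^*)]^{-1}$$

$$\text{AOQ} = 1 - \frac{E(X_0)[E(i^*) - 1]}{FE(i^*)\Pr(S > S_1)}$$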








TABLE 241b

    F       E(i*)    Pr(S>S_1)   E(S)             E(X*)           PTDP     AOQ
    300     35.295   .98522      1.05885 x 10^4   1.0289 x 10^4   .99764   .02829
    340     31.369   .98328      1.0665 x 10^4    1.0325 x 10^4   .99766   .03188
    350     30.527   .98280      1.0684 x 10^4    1.0335 x 10^4   .99767   .03267
    400     26.950   .98039      1.0780 x 10^4    1.0380 x 10^4   .99769   .03711
    10^4    3.4627   .66667      2.4627 x 10^4    1.4627 x 10^4   .99899   .40606

Max L = .93184 at F = 340.
The best sampling rate (F-value) is about 340, in contrast to 410 for regularly
spaced inspections. The maximum L is reduced, about one per cent, to .93184.
Remember, however, in random spacing the consumer is protected against
periodicities, etc.

As the second example consider the production situation: Items are produced 
one at a time by a complicated automatic machine. If the machine is left un- 
attended, the quality degenerates slowly. There are many parts of the machine 
having a chance to malfunction, and as time goes on some of these parts become 
imperfect. For instance, cutting edges become chipped and adjustments lose 
calibration. The machine is constructed so that even if some of these minor 
flaws exist it does not necessarily happen that the output will be bad. Thus, 
even if a cutting edge has a chip in it, a defect might only occur if the chip 
hits at the beginning of a cutting operation. Assume the quality of the item 
produced at $t$ is approximately of the form $PR^{z(t)}$ where P=.99, R=.95 and
z(t) is a Poisson process with $\lambda$=.1. P=.99 means in the best condition possible
one per cent of the output will be bad, hence, AOQ>.01. R=.95 means when
a defect appears in the machine the probability it will cause a bad item is .05.

Let the value of a good item be 1 unit, the value of a bad item be 0 units,
the inspection cost be negligible, the inspection be destructive, the trouble-shooting
cost be 5+z(t), and the trouble-shooting time be 3+z(t).

Let us analyze this process using SPy (discrete). Discreteness is used because 
the production between inspections will not be large. It is necessary to compute 
the following functions for selected values of F. 


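$$\Pr(S > S_1) = \frac{.99e^{-.1(.05)}}{1 + [1 - e^{-.1(.05)}](F - 1)}$$

$$E(X_0) = \frac{.99 - e^{.005}\Pr(S > S_1)}{(.1)(.05)}$$

$$E(i^*) = \sum_{i=0}^{\infty} \prod_{j=1}^{i} \frac{.99e^{-.1(1 - .95^j)}}{1 + [1 - e^{-.1(1 - .95^j)}](F - 1)}$$

$$E(S) = FE(i^*)$$

$$E(X^*) = \frac{E(X_0)[E(i^*) - 1]}{\Pr(S > S_1)}$$

$$E(z^*) = .1E(S)$$

$$L = \frac{E(X^*) - 5 - E(z^*) - E(i^*)}{E(S) + 3 + E(z^*)}$$

$$\text{PTDP} = \frac{E(S)}{E(S) + 3 + E(z^*)}$$

$$\text{AOQ} = 1 - \frac{E(X^*)}{E(S)}$$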
TABLE 243

    F     E(i*)      Pr(S>S_1)   E(S)         E(X*)        L         PTDP      AOQ
    15    4.9635     .920769     74.4525      55.60829     .4502     .876967   .252835
    20    4.344526   .899795     86.89052     63.705256    .463298   .881425   .266833
    25    3.924465   .879755     98.111625    70.363082    .465436   .884504   .282826
    30    3.616440   .860588     108.493205   76.06677     .462644   .886799   .298880
    35    3.377966   .842239     118.228827   81.022655    .45713    .888593   .314696
    50    2.895774   .791604     144.788721   93.125566    .436013   .892284   .356818
    100   2.188141   .659429     218.81413    117.93184    .364645   .897900   .461041

Maximizing L = .465 at F = 25.


The optimum income per unit of time is less than half of what would be 
obtained if production was all good. For the optimum F trouble-shooting must 
be done once for about every hundred items produced. The AOQ corresponding 
to the optimum F is .28 which for many products would not be tolerable. With 
the information supplied in Table 243 it might be considered desirable not to 
continue production until the process can be improved substantially. 


7. PROOFS 
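Define x(0)=1, and remember that $S_0 = 0$ and z(0)=0. Then by definition

$$\Pr(S > S_i) = \Pr\left[\prod_{j=0}^{i} x(S_j) = 1\right]$$

$$= E\left\{\Pr\left[\prod_{j=0}^{i} x(S_j) = 1 \,\Big|\, S_0, \cdots, S_i;\ z(S_0), \cdots, z(S_i)\right]\right\}$$

$$= E\left\{E\left[\prod_{j=1}^{i} \left(PR^{z(S_j)}\right) \,\Big|\, S_0, \cdots, S_i\right]\right\}.$$

Since z(t) is stationary with independent increments, we may for fixed
$S_0, \cdots, S_i$ use

$$z(S_j) = \sum_{k=1}^{j} z'(I_k)$$

where $I_j = S_j - S_{j-1}$ and the $z'(I_j)$ are mutually independent with the same
distribution as $z(I_j)$ and obtain

$$\Pr(S > S_i) = E\left\{E\left[P^i R^{\sum_{j=1}^{i}(i-j+1)z'(I_j)} \,\Big|\, I_1, \cdots, I_i\right]\right\}$$

$$= E\left\{\prod_{j=1}^{i} E\left[PR^{(i-j+1)z'(I_j)} \,\Big|\, I_j\right]\right\}$$

$$= \prod_{j=1}^{i} E\left\{E\left[PR^{(i-j+1)z'(I_j)} \,\Big|\, I_j\right]\right\}$$

$$= \prod_{j=1}^{i} \left\{E\left[PR^{(i-j+1)z(I_j)}\right]\right\}$$

$$= \prod_{j=1}^{i} \left\{E\left[PR^{jz(I_j)}\right]\right\}$$

$$= P^i \prod_{j=0}^{i} \left\{E\left[R^{jz(I_j)}\right]\right\}$$

which is I. To obtain II we note under PPS; that $z(I_j)$ has for fixed $I_j$ a Poisson
distribution with parameter $I_j\lambda$.
Starting with

$$\Pr(S > S_i) = P^i \prod_{j=0}^{i} \left\{E\left[E\left(R^{jz(I_j)} \mid I_j\right)\right]\right\}$$

we do the following computation:

$$E\left[R^{jz(I_j)} \mid I_j\right] = \sum_{k=0}^{\infty} e^{-\lambda I_j}\frac{(\lambda I_j)^k}{k!}R^{jk} = e^{-\lambda I_j(1 - R^j)}.$$

When the condition SP; is added, $I_j = F$. Hence III is obtained from II by
replacing $I_j$ by F thusly:

$$\Pr(S > S_i) = P^i \prod_{j=0}^{i} \left[e^{-\lambda F(1 - R^j)}\right]$$

$$= P^i e^{-\lambda F\left[(i+1) - \sum_{j=0}^{i} R^j\right]}$$

$$= P^i e^{-\lambda F\left[(i+1) - (1 - R^{i+1})/(1 - R)\right]}$$

which reduces to the desired form.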
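Using SP1, (discrete) we have from II and the distribution of $I_j$

$$\Pr(S > S_i) = P^i \prod_{j=0}^{i} \left\{\sum_{k=1}^{\infty} e^{-\lambda k(1 - R^j)}\left(1 - \frac{1}{F}\right)^{k-1}\frac{1}{F}\right\}$$

$$= P^i \prod_{j=0}^{i} \frac{\dfrac{1}{F}e^{-\lambda(1 - R^j)}}{1 - \left(1 - \dfrac{1}{F}\right)e^{-\lambda(1 - R^j)}}$$

and then IV is a simplification. Similarly, using SP; (continuous) we have

$$\Pr(S > S_i) = P^i \prod_{j=0}^{i} \left[\int_0^{\infty} \frac{e^{-\lambda t(1 - R^j) - t/F}}{F}\,dt\right]$$

$$= P^i \prod_{j=0}^{i} \left\{\frac{1}{F\left[\lambda(1 - R^j) + \frac{1}{F}\right]}\int_0^{\infty}\left[\lambda(1 - R^j) + \frac{1}{F}\right]e^{-t[\lambda(1 - R^j) + 1/F]}\,dt\right\}$$

$$= P^i \prod_{j=0}^{i} \left\{F\left[\lambda(1 - R^j) + \frac{1}{F}\right]\right\}^{-1}$$

and hence V.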
The proof of VI begins with

$$E(S) = E\left[\sum_{i=1}^{\infty} I_i \prod_{j=0}^{i-1} x(S_j)\right].$$

Now note that $I_i$ is independent of $S_0, \cdots, S_{i-1}$, $x(S_0), \cdots, x(S_{i-1})$, and
$z(S_0), \cdots, z(S_{i-1})$. Hence

$$E(S) = \sum_{i=1}^{\infty} E(I_i)\,E\left[\prod_{j=0}^{i-1} x(S_j)\right]$$

and thus VI is obtained.

To prove VII we need only note that $i^* = \sum_{i=0}^{\infty}\left[\prod_{j=0}^{i} x(S_j)\right]$.

Results VIII and IX are the same as I and II for the case of i=1 and a playing
the role of $S_1$.

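To obtain X note that

$$X_0 = \sum_{a=1}^{I_1 - 1} [x(a)]$$

and for XI we have

$$X_0 = \int_0^{I_1} x(a)\,da$$

and thus

$$E(X_0) = E\left[\int_0^{I_1} x(a)\,da\right]$$

$$= E\left\{E\left[\int_0^{I_1} x(a)\,da \,\Big|\, I_1 \text{ and } z(a) \text{ for } 0 < a < I_1\right]\right\}$$

$$= E\left\{\int_0^{I_1} E\left[x(a) \mid I_1 \text{ and } z(a) \text{ for } 0 < a < I_1\right]da\right\}$$

$$= E\left[\int_0^{I_1} PR^{z(a)}\,da\right].$$

Under PPS, (discrete) using VIII, IX and X we have

$$E(X_0) = PE\left[\sum_{a=1}^{I_1 - 1} e^{-a\lambda(1 - R)}\right]$$

$$= \frac{Pe^{-\lambda(1 - R)}}{1 - e^{-\lambda(1 - R)}}\,E\left\{1 - e^{-(I_1 - 1)\lambda(1 - R)}\right\}$$

and now using II with i=1 we obtain XII.
To prove XIII we proceed in the same manner as follows:

$$E(X_0) = PE\left[\int_0^{I_1} e^{-a\lambda(1 - R)}\,da\right] = PE\left[\frac{1 - e^{-I_1\lambda(1 - R)}}{\lambda(1 - R)}\right].$$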

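To prove XIV we note that in the discrete case

$$E(X^*) = E\left\{\sum_{i=0}^{\infty}\left[\prod_{j=0}^{i} x(S_j)\right]\left[\sum_{a=S_i+1}^{S_{i+1}-1} x(a)\right]\right\}$$

$$= E\left\{\sum_{i=0}^{\infty}\left[P^i \prod_{j=0}^{i} R^{z(S_j)}\right]\left[R^{z(S_i)}\sum_{a=1}^{I'-1} PR^{z'(a)}\right]\right\}$$

where $I'$ is distributed like $I_{i+1}$ and is independent of all random variables occurring
up to and including time $S_i$ and $z'(1), \cdots, z'(I'-1)$. Hence

$$E(X^*) = E(X_0)\,E\left\{\sum_{i=0}^{\infty}\left[P^i R^{\sum_{j=1}^{i}(i-j+2)z'(I_j)}\right]\right\}$$

where $\sum_{j=1}^{0} = 0$ and where the $z'(I_j)$'s are independently distributed like the
$z(I_j)$'s. Whence

$$E(X^*) = E(X_0)\left\{\sum_{i=0}^{\infty} P^i \prod_{j=2}^{i+1} E\left[R^{jz(I_j)}\right]\right\}$$

where $\prod_{j=2}^{1} = 1$. Now dividing and multiplying by $PE[R^{z(I_1)}]$ we obtain

$$E(X^*) = \frac{E(X_0)\left\{\sum_{i=0}^{\infty} P^{i+1}\prod_{j=1}^{i+1} E\left[R^{jz(I_j)}\right]\right\}}{PE\left[R^{z(I_1)}\right]}$$

and now XIV follows from I and VII. The continuous case of XIV is done
analogously.


To prove XV we note (1)

$$z^* = \sum_{i=1}^{\infty}\left[z(S_i) - z(S_{i-1})\right]\left[\prod_{j=0}^{i-1} x(S_j)\right]$$

and (2) $[z(S_i) - z(S_{i-1})]$ is independent of everything up to and including $S_{i-1}$.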
A SINGLE SAMPLING PLAN FOR CORRELATED VARIABLES
WITH A SINGLE-SIDED SPECIFICATION LIMIT


K. C. Seal
Calcutta University


An economic single sampling plan for a variable having a single-sided 
specification limit whose direct measurement is very expensive or de- 
structive is suggested in this paper. Assuming existence of at least one 
correlated auxiliary variable whose measurement is not so expensive 
and of a known dispersion matrix of these variables, the suggested plan 
is shown to have easy practical application. Further the choice of only
one auxiliary inexpensive variable highly correlated with the main variable
is shown to be optimum in most of the practical situations.


1. INTRODUCTION 


In some manufacturing processes measurement of the required characteristic
to be kept under control may be so expensive that only a few such measurements
can be taken for the sake of economy. A single sampling acceptance plan
for such a variable having a single-sided specification limit, as is usually em- 
ployed (see, for instance, [1]) may not be economical in practice. This will 
usually be the case when a direct measurement of a variable by an inexpensive 
method is inaccurate or it is expensive or destructive. For example, the length 
of life of an incandescent lamp cannot be determined until it has been destroyed 
by being allowed to burn out. Tensile strength of a metal bar is usually meas- 
ured by stretching it until “necking” begins. Measurement of tensile strength 
is rather expensive and difficult too. However, hardness of metal which is 
usually correlated with tensile strength, can be measured cheaply and rather 
accurately. There would, therefore, be a considerable economy if tensile 
strength could be estimated with a high degree of precision on the basis of its 
hardness and/or any other correlated variables. 

In such situations it will usually be profitable to select a few auxiliary variables
whose measurements may not be expensive. The relation between the
main variable $y$ and these auxiliary variables $x_1, x_2, \cdots, x_k$ may be used to
provide an alternative single-sampling plan in such cases. One such plan is
discussed below. The suggested plan assumes that accurate estimates of the
lot standard deviation and covariance between pairs of variables are available
for each measured characteristic $y, x_1, \cdots, x_k$. Since it is unlikely in ordinary
situations that this information will specially be known on a lot-by-lot basis
for items to be submitted in the future, the practical implication of this underlying
assumption is that the dispersion matrix of the variables $y, x_1, \cdots, x_k$
is in control and so its value remains virtually unaltered from lot to lot. The
procedure assumes that a reliable estimate of this dispersion matrix will at first 
be made by taking a relatively large sample of size N from one or more lots at 
the initial stage and that this estimate will then be used in the acceptance 
criterion for future incoming goods as the dispersion matrix for the items of 
each inspection lot. 

The question that will naturally arise here is how to determine the size of
the first sample so as to ensure for practical purposes that the population dispersion
matrix is known. One simple way of determining approximately the
sample size follows. By the central limit theorem it is easily seen that each
element of the dispersion matrix after usual standardization tends to the standard
normal distribution N(0, 1) when the sample size on which the variances
and covariances are based increases indefinitely. It is also known that the
limiting normal distribution can be assumed to hold good for practical purposes
for a sample of size larger than 50. For such a large sample size the confidence
limits within which each element of the population dispersion matrix will lie
can then be easily determined for a given level of confidence and the sample
size (which will not usually be much larger than 50) can be so determined that
the approximate limits are as close as are desired.

Although the dispersion matrix of the measurements of item characteristic 
about the means is assumed to remain practically constant, the means of these 
measurements are assumed to vary from lot to lot. This may happen for many 
manufacturing processes where the relatively important or assignable causes 
of item variation that enter to produce noticeable changes affect only the 
average tensile strength, the average hardness or the average of other proper- 
ties. Such a situation commonly arises due to the effect of such assignable causes 
as the slippage of machine adjustments, incorrect tool or machine settings or 
pronounced tool wear. The dispersions of measurements about their mean 
values, on the other hand, will be produced by the constantly present system 
of minor chance—acting causes inherent in the process—a system that will 
remain approximately the same regardless of any change in the mean values 
and so will keep the dispersion matrix almost the same. Under such circumstances,
the dispersion matrix can be accurately estimated from a preliminary
large sample of size N as suggested above. Thus the dispersion matrix of the
variables $y, x_1, \cdots, x_k$ is assumed to be known. The actual model used in the
proposed plan is detailed in the next section.


2. MODEL 


The measured quality characteristics $y, x_1, \cdots, x_k$ are all assumed to be
normally distributed. The dispersion matrix of the variables remains unaltered
from lot to lot and is assumed to be accurately known from a previous large
sample of size N of the above characteristics. The population means $\mu_y,
\mu_1, \cdots, \mu_k$ are, however, assumed to be unknown and not under control. It is
thus assumed that the linear regression of $y$ on $x_1, \cdots, x_k$ remains unaltered
in the population except for some shifts in their averages. Mathematically our
model can be expressed as

$$y_t = \alpha + \sum_{i=1}^{k} \beta_i x_{it} + \epsilon_{yt}, \qquad (2.1)$$

where $\epsilon_{yt}$, $x_{it}$, $i = 1, \cdots, k$, are normally distributed random variables corresponding
to the t-th lot submitted for inspection, $\epsilon_{yt}$ is stochastically independent
of $(x_{1t}, \cdots, x_{kt})$, $\alpha$ is an unknown parameter, $(\beta_1, \cdots, \beta_k)$ are regression coefficients
assumed to be known due to our assumption of known dispersion
matrix of $y, x_1, \cdots, x_k$. Clearly
$$\text{Var}(\epsilon_{yt}) = \sigma_y^2(1 - R_k^2), \qquad (2.2)$$

where $R_k$ is the population multiple correlation coefficient.
From (2.1), by taking expectations, we get

$$\mu_{yt} = \alpha + \sum_{i=1}^{k} \beta_i \mu_{it}, \qquad (2.3)$$

whereas for some initial lot with t=0 (say) when the first sample of size N was
taken,

$$\mu_{y0} = \alpha + \sum_{i=1}^{k} \beta_i \mu_{i0}, \qquad (2.4)$$

where $\mu_{y0}, \mu_{10}, \cdots, \mu_{k0}$ are population means of $y, x_1, \cdots, x_k$ in the initial lot.
Estimates of $\mu_{y0}$ and $\mu_{i0}$, $i = 1, \cdots, k$, are known from the sample means $\bar{y}_{N0}$
and $\bar{x}_{iN0}$, $i = 1, \cdots, k$, observed for the initial lot.
Likewise for the t-th lot to be inspected for acceptance or rejection, we get
by (2.3) and (2.4),

$$\mu_{yt} = \mu_{y0} + \sum_{i=1}^{k} \beta_i(\mu_{it} - \mu_{i0}), \qquad (2.5)$$

so that an unbiased estimate of $\mu_{yt}$, based on $\bar{y}_{N0}$, $\bar{x}_{iN0}$ and $\bar{x}_{int}$, $i = 1, \cdots, k$,
the observed sample means in the initial and the t-th lot, is given by

$$Y_{nt} = \bar{y}_{N0} + \sum_{i=1}^{k} \beta_i(\bar{x}_{int} - \bar{x}_{iN0}). \qquad (2.6)$$

It is readily verified that

$$\text{Var}(Y_{nt}) = \text{Var}\left(\bar{y}_{N0} - \sum_{i=1}^{k} \beta_i \bar{x}_{iN0}\right) + \text{Var}\left(\sum_{i=1}^{k} \beta_i \bar{x}_{int}\right) = \frac{\sigma_y^2(1 - R_k^2)}{N} + \frac{\sigma_y^2 R_k^2}{n}.$$

3. THE SUGGESTED PLAN 


Let $U_y$ represent the specified tolerance limit such that if the population
mean $\mu_y$ of a lot submitted for inspection exceeds the value $U_y$, the lot is rejected;
otherwise it is accepted. The suggested plan will imply in most of the
practical situations that if 50% of the units are larger than $U_y$, the lot is to
be rejected. Following the notations similar to those of [1] we define

$$K = \frac{U_y - \mu_y}{\sigma_y}. \qquad (3.1)$$

For many destructive tests it is preferable to use concomitant variables rather
than test the y-values in the sample.


Suppose samples of size n are taken from each lot and for the sake of simplicity
the sample averages of the k auxiliary variables corresponding to any lot
other than the initial one are denoted by $\bar{x}_{1n}, \cdots, \bar{x}_{kn}$ whereas the averages
derived from the first sample of size N taken from the initial lot are denoted
by $\bar{y}_N, \bar{x}_{1N}, \cdots, \bar{x}_{kN}$ respectively. For choosing the sample size n it should be
remembered that the larger the sample, the better can the distinction be made
between good and bad lots. But larger sample size will involve greater inspection
costs per unit for the inspection-lots. The size n is selected so as to constitute
a practical balance between inspection costs and quality protection. If,
however, some further condition is imposed on the plan (see, for instance,
Section 4) the value of n may be definitely determined.

Clearly the statement that a specified proportion p of the population lies
above the limit $U_y$ is equivalent to saying that

$$\mu_y = U_y - K_p\sigma_y. \qquad (3.2)$$

It is desired to devise acceptance procedures with a suitable n such that the
probability of rejecting a lot having $p = p_1$ is $\alpha$ (where $\alpha$ is usually taken as 0.05).
The larger p, the proportion defective, the larger $\mu_y$; we are interested in rejecting
the lot if $\mu_y$ is large. In our plan $\sigma_y$ is assumed to be known; hence, when
direct measurement of y is only used, the critical region based on samples of
size n* (say) is to reject a lot if

$$\bar{y}_{n^*} > U_y - \lambda^*\sigma_y, \qquad (3.3)$$

where (see [1], p. 125)

$$\lambda^* = K_{p_1} - \frac{K_\alpha}{\sqrt{n^*}}. \qquad (3.4)$$

Also the Operating Characteristic (OC) curve of this problem is

$$L_p^* = \int_{-\infty}^{K_\alpha + \sqrt{n^*}(K_p - K_{p_1})} \frac{e^{-t^2/2}}{\sqrt{2\pi}}\,dt. \qquad (3.5)$$

The suggested acceptance sampling plan is to reject any lot for which

$$Y_n + \lambda\sigma_y > U_y \qquad (3.6)$$

and accept it otherwise; here instead of (2.6) we write

$$Y_n = \bar{y}_N + \sum_{i=1}^{k} \beta_i(\bar{x}_{in} - \bar{x}_{iN}). \qquad (3.7)$$

Our main problem is to so determine the value of the constant $\lambda$ that the prescribed
risk is controlled. Clearly for the suggested plan $\lambda$ will in general be
different from $\lambda^*$. The actual value of $\lambda$ can be derived from the following
considerations.

As stated earlier, $Y_n + \lambda\sigma_y$ obeys a normal distribution with

$$\text{mean} = \mu_y + \lambda\sigma_y = \mu_{y0} + \sum_{i=1}^{k} \beta_i(\mu_i - \mu_{i0}) + \lambda\sigma_y,$$

$$\text{variance} = \sigma_y^2\left[\frac{R_k^2}{n} + \frac{1 - R_k^2}{N}\right] = \frac{\sigma_y^2}{n_1} \qquad (3.9)$$

(say), where

$$\frac{1}{n_1} = \frac{R_k^2}{n} + \frac{1 - R_k^2}{N}, \qquad 0 < R_k^2 < 1, \qquad (3.10)$$

and $\mu_y$, $\mu_i$'s and $R_k$ have the same meaning as defined in the previous section.
By (3.10) it is clear that $n_1$ will always lie between n and N, though it will not
usually be an integer.
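For controlling the prescribed risk $\alpha$, we should have

$$P[Y_n + \lambda\sigma_y > U_y \mid p = p_1] = \alpha. \qquad (3.11)$$

This implies

$$P\left[\frac{(Y_n - \mu_y)\sqrt{n_1}}{\sigma_y} > \frac{(U_y - \lambda\sigma_y - \mu_y)\sqrt{n_1}}{\sigma_y} \,\Big|\, p = p_1\right] = \alpha,$$

that is,

$$K_\alpha = \frac{(U_y - \lambda\sigma_y - \mu_y)\sqrt{n_1}}{\sigma_y} = (K_{p_1} - \lambda)\sqrt{n_1}$$

by (3.2). Hence

$$\lambda = K_{p_1} - K_\alpha/\sqrt{n_1}. \qquad (3.12)$$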


The probability $L_p$ of accepting a lot with a given proportion defective p,
that is, the OC curve of the suggested plan, is

$$L_p = \int_{-\infty}^{K_\alpha + \sqrt{n_1}(K_p - K_{p_1})} \frac{e^{-t^2/2}}{\sqrt{2\pi}}\,dt. \qquad (3.13)$$

Hence the larger the $n_1$, the better will be the suggested plan. By comparing
(3.5) with (3.13) it is seen that the usual sampling plan [1], when only direct
measurements on y are used, will be equivalent to the suggested plan only when
$n^* = n_1$ and where $n_1$ lies between n and N.
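A small numerical sketch may help in reading (3.10), (3.12) and (3.13) together. It assumes that $K_p$ denotes the upper 100p per cent point of the standard normal distribution, which is consistent with (3.2). The function names and the trial values n = 20, N = 100, $R_k^2$ = .8 are hypothetical and are not taken from the paper.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def upper_point(p):
    """K_p: the value exceeded with probability p by a standard normal variable."""
    return _STD.inv_cdf(1.0 - p)


def effective_n(n, N, r2):
    """n_1 of (3.10): 1/n_1 = R_k^2/n + (1 - R_k^2)/N."""
    return 1.0 / (r2 / n + (1.0 - r2) / N)


def acceptance_constant(p1, alpha, n1):
    """lambda of (3.12): K_{p1} - K_alpha / sqrt(n_1)."""
    return upper_point(p1) - upper_point(alpha) / math.sqrt(n1)


def oc_curve(p, p1, alpha, n1):
    """L_p of (3.13): probability of accepting a lot with proportion defective p."""
    return _STD.cdf(upper_point(alpha)
                    + math.sqrt(n1) * (upper_point(p) - upper_point(p1)))


if __name__ == "__main__":
    n1 = effective_n(n=20, N=100, r2=0.8)      # hypothetical trial values
    print("n_1 =", n1)
    print("lambda =", acceptance_constant(0.01, 0.05, n1))
    for p in (0.01, 0.02, 0.05, 0.10):
        print(p, oc_curve(p, 0.01, 0.05, n1))
```

At $p = p_1$ the acceptance probability equals $1 - \alpha$ by construction, which gives a quick check on any implementation.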

In case more than one auxiliary variable such as $x_1, \cdots, x_k$ are at our disposal
the next problem worthy of solution may be to decide which subset from
$(x_1, \cdots, x_k)$ may be neglected without affecting much the steepness of the
OC curve.

The expression (3.13) for $L_p$ shows that larger values of $n_1$ as defined in
(3.10) should be preferred. The actual value of $n_1$ will depend on n, N and $R_k^2$;
of these three the first two are assumed to have been selected from other practical
considerations and the only choice left is to the value of $R_k^2$. However, the
behavior of $n_1$ with $R_k^2$ will be different for the following three cases:


(i) $n < n_1 < N$
(ii) $n = n_1 = N$
(iii) $N < n_1 < n$.
The case (i) is likely to be the most common one, since for the validity of our
model we had to assume N to be a large integer. In this case

$$\frac{N}{n_1} = 1 + R_k^2\left(\frac{N}{n} - 1\right), \qquad (3.14)$$

so that $n_1$ will be nearly equal to its maximum possible value N when $R_k^2$ is a
very small quantity. But a very small value of $|R_k|$ will suggest that the model
assumed above is a doubtful one and as such the suggested procedure cannot
be much relied upon in practice. Thus although the smaller value of $R_k^2$ is
desirable and can be sometimes achieved by choosing only one out of the given
k auxiliary variables, yet the actual choice of the variable x should be such
that it is significantly correlated with y and also that it is cheap to measure.
The aforesaid dependence of $n_1$ on $R_k^2$ may at first sight appear to be a peculiar
result. The main reason for such an odd result is that when N is large relative
to n, the contribution of

$$\frac{\sigma_y^2(1 - R_k^2)}{N}$$

to the variance of $Y_n$ due to departure from the regression line becomes
negligibly small when compared with that of the other term $\sigma_y^2 R_k^2/n$, so that
wide departure from the regression line becomes of less practical importance.

In the second case (ii) when n=N, a very simple result is arrived at, viz., $n_1$
becomes independent of $R_k^2$; but as in the previous case in this case also the
validity of our model necessitates selection of one auxiliary variable x significantly
correlated with y which is inexpensive to measure. The choice between
cases (i) and (ii), i.e., whether n should be less than or equal to N, will depend
on the fund at our disposal and on how expensive the variable x is to measure.

The remaining case (iii), when n>N, will be practically the least important
owing to higher cost that has to be incurred; we get

$$\frac{1}{n_1} = \frac{1}{N} - R_k^2\left(\frac{1}{N} - \frac{1}{n}\right), \qquad (3.15)$$

so that $n_1$ becomes larger as $R_k^2$ becomes larger, and is equal to n when $R_k^2$ is
equal to 1. Although larger values of $R_k^2$ will lead in this case to a steeper OC
curve, yet this improvement will not be of much practical importance since
we had to assume N to be a large number for the validity of our model, and
therefore, $n_1$ will be large even when $R_k^2$ is assumed to be a small quantity.
Further, in case the value of $R_k^2$ is enhanced by choosing the number k of
auxiliary variables large, the suggested plan will become somewhat expensive
owing to measurement of all these k variables. Hence the choice of only one
auxiliary variable x highly correlated with y and whose measurement is inexpensive
will be optimum in the present situation.

From the above study the following optimum procedure suggests itself. The
sample size N to be taken from the initial lot is to be determined from practical
considerations such that the dispersion matrix can be estimated quite accurately.
Only one auxiliary variable x having significant correlation with y
whose measurement is the least expensive should be chosen. Unless the cost of
measurement is prohibitive, the second sample size n of x to be taken from subsequent
lots may be made equal to N, in which case the suggested plan becomes
very simple to adopt. However, if choice of this size n=N becomes too expensive,
a smaller n may be chosen from other practical considerations so as to
balance between inspection costs and quality protection.


4. RELATED PLANS 


4.1. In the above plan we noticed that the choice of the second sample size
n could be made somewhat arbitrarily. If, however, some further restriction is
imposed on the plan some definite value of n may have to be selected. For
instance, suppose it is desired that the acceptance plan should be such that
for two specified values of p, viz. $p_1$ and $p_2$, the probability of rejecting a lot
should be $\alpha$ and $1 - \beta$ respectively. Thus $p_1$ and $p_2$ may correspond to the usual
Acceptable Quality Level (AQL) and Lot Tolerance Percent Defective (LTPD)
with $\alpha$=.05 and $\beta$=.10 respectively. As before it is assumed that the larger p,
the proportion defective, the larger $\mu_y$ and we are interested in rejecting the
lot if $\mu_y$ is large. In this case the acceptance sampling plan such that any lot is
rejected for which the inequality (3.6) is satisfied, may be considered to involve
two unknown constants $\lambda$ and n which are so chosen that the prescribed risks
$\alpha$ and $\beta$ are controlled. It is readily seen that here $\lambda$ and n should satisfy the
equations

$$\lambda = K_{p_1} - K_\alpha\left[\frac{R_k^2}{n} + \frac{1 - R_k^2}{N}\right]^{1/2},$$

$$\lambda = K_{p_2} - K_{1-\beta}\left[\frac{R_k^2}{n} + \frac{1 - R_k^2}{N}\right]^{1/2},$$

so that the solutions $\lambda'$ and $n'$ (say) are given by

$$\lambda' = \frac{K_\alpha K_{p_2} - K_{1-\beta}K_{p_1}}{K_\alpha - K_{1-\beta}}, \qquad \frac{R_k^2}{n'} = \left[\frac{K_{p_1} - K_{p_2}}{K_\alpha - K_{1-\beta}}\right]^2 - \frac{1 - R_k^2}{N}. \qquad (4.1.3)$$

Likewise for the single sampling plan involving direct measurements on y,
when $\sigma_y$ is assumed to be known from a preliminary large sample of size N,
the $\lambda^*$ and $n^*$ of the plan suggested in Section 3, should satisfy [1]

$$\lambda^{*\prime} = \frac{K_\alpha K_{p_2} - K_{1-\beta}K_{p_1}}{K_\alpha - K_{1-\beta}}, \qquad (4.1.4)$$

which is identical with $\lambda'$. Also

$$\frac{1}{n^{*\prime}} = \left[\frac{K_{p_1} - K_{p_2}}{K_\alpha - K_{1-\beta}}\right]^2 = \frac{1 - R_k^2}{N} + \frac{R_k^2}{n'}, \quad \text{by (4.1.3).}$$

Since $0 < R_k^2 < 1$, we get

$$n' < n^{*\prime} < N \qquad (4.1.6)$$

under the realistic assumption $n' < N$ already discussed.
If C and $c_k$ are expenditures involved per item in recording y and the auxiliary
variable x's respectively, the total cost involved in inspecting a single lot when
auxiliary variables are not taken into account may be taken as

$$T^* = (N + n^{*\prime})C, \qquad (4.1.7)$$

whereas the corresponding cost to be incurred under the suggested plan will be

$$T' = N(C + c_k) + n'c_k. \qquad (4.1.8)$$

Both the procedures are equivalent in the sense that both of them satisfy our
requirement regarding AQL and LTPD. The suggested procedure will clearly
be preferable only if

$$(N + n')c_k < n^{*\prime}C,$$

that is, when

$$\frac{C}{c_k} > 1 + R_k^2\frac{N}{n'} + (1 - R_k^2)\frac{n'}{N}. \qquad (4.1.9)$$

The right hand side of this inequality will usually be much greater than 1 if
$N > n'$. Hence the suggested plan is superior only if direct measurement of the
variable y is very expensive as compared with that of correlated auxiliary
variables.

However, as is usually the case, if more than one, say v lots are to be inspected
for acceptance by sampling, then the total costs involved in inspecting
the v lots besides the initial lot under the two plans will be given by

$$T^* = (N + n^{*\prime}v)C, \qquad (4.1.7')$$

and

$$T' = N(C + c_k) + n'vc_k, \qquad (4.1.8')$$

so that $T'$ is less than $T^*$ only when

$$\frac{C}{c_k} > \frac{N}{vn^{*\prime}} + R_k^2\left(1 - \frac{n'}{N}\right) + \frac{n'}{N}$$

$$= \frac{1}{v}\left[R_k^2\left(\frac{N}{n'} - 1\right) + 1\right] + R_k^2 + (1 - R_k^2)\frac{n'}{N}. \qquad (4.1.9')$$
v v vn’ N 
For v=1 the inequality (4.1.9') coincides with (4.1.9) as it should be. It should
be noticed that the right hand side of (4.1.9') is a decreasing function of v so
that when v>1 the relative cost $C/c_k$ of measurement of the main variable y
when compared with that of x's need not be so high to make the suggested plan
superior to the usual one, as was necessary for the case v=1. Further, when v
is very large, $T^*$ and $T'$ defined by (4.1.7') and (4.1.8') will be approximately
equal to $n^{*\prime}vC$ and $n'vc_k$ respectively. Since $n'c_k$ will usually be less than $n^{*\prime}C$, $T^*$
will be larger than $T'$ in such situations. Even without assuming this explicitly
it is seen from the r.h.s. of the inequality (4.1.9') that when v is indefinitely
large, $T'$ will be less than $T^*$ whenever

$$C/c_k > \left[R_k^2 + (1 - R_k^2)\frac{n'}{N}\right] = n'/n^{*\prime}.$$
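The two-point plan of this section and the cost comparison can be sketched numerically as follows. The sketch assumes that $K_p$ is the upper 100p per cent point of the standard normal distribution, and it computes the break-even value of $C/c_k$ as $(N + vn')/(vn^{*\prime})$, which follows from comparing (4.1.7') with (4.1.8'). The function names and the trial values $p_1$ = .01, $p_2$ = .05, N = 100, $R_k^2$ = .8 are hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def upper_point(p):
    """K_p: the value exceeded with probability p by a standard normal variable."""
    return _STD.inv_cdf(1.0 - p)


def two_point_plan(p1, p2, alpha, beta, N, r2):
    """Return (lambda', n', n*') for rejection risks alpha at p1 and 1 - beta at p2."""
    k_a, k_b = upper_point(alpha), upper_point(1.0 - beta)
    k_p1, k_p2 = upper_point(p1), upper_point(p2)
    s = (k_p1 - k_p2) / (k_a - k_b)            # the common square-root factor
    lam = (k_a * k_p2 - k_b * k_p1) / (k_a - k_b)
    n_star = 1.0 / s ** 2                      # direct measurements on y only
    remainder = s ** 2 - (1.0 - r2) / N        # equals R_k^2 / n' by (4.1.3)
    if remainder <= 0.0:
        raise ValueError("N is too small for these risks: no positive n' exists")
    return lam, r2 / remainder, n_star


def cost_ratio_threshold(N, n_prime, n_star, v=1):
    """Smallest C/c_k for which T' < T* when v lots are inspected."""
    return (N + v * n_prime) / (v * n_star)


if __name__ == "__main__":
    lam, n_prime, n_star = two_point_plan(0.01, 0.05, 0.05, 0.10, N=100, r2=0.8)
    print("lambda' =", lam, " n' =", n_prime, " n*' =", n_star)
    for v in (1, 10, 100):
        print(v, cost_ratio_threshold(100, n_prime, n_star, v))
```

As v grows, the threshold returned by `cost_ratio_threshold` approaches $n'/n^{*\prime}$, in line with the limiting inequality above.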


4.2. An acceptance sampling plan closely similar to the one studied above can
also be derived for a slightly different problem. Suppose that the population
dispersion matrix for the variables $y, x_1, \cdots, x_k$ is known from a large number
of observations on these variables collected previously from a large number of
samples taken from the inspection lots. The sample size n to be taken from each
such inspection lot is assumed to be given from other practical considerations.
This problem is different from the problem already considered in the previous
sections only in that here no large sample of size N is considered, yet the population
dispersion matrix is assumed to be known. The solution to such a problem
will be simpler as this plan becomes virtually independent of $R_k$; the required
acceptance plan can, in fact, be easily derived from (3.6) by taking N
equal to n. Such a simple plan may be useful in the following situation. Suppose
the usual acceptance-plan [1] involving n observations on the main variable y
alone, is employed during the initial stage, but some inexpensive auxiliary
variable x's are also recorded for each such sample item for some later use.
When the data thus collected become sufficiently large so as to provide a reliable
estimate of the population dispersion matrix, only the inexpensive auxiliary
variable x's are recorded for sample items from subsequent lots. Clearly
the aforesaid plan is applicable in this case. The main defect of such a plan lies
in the fact that here information available from a large number of observations
on x's taken from the inspection lots during the initial stage is wasted, so that
the plan becomes rather an expensive one.


5. ILLUSTRATION 


5.1. An item of annealed steel is received in large quantities. For an item to
be acceptable, its tensile strength ($y$) must not exceed $U_y = 100$ thousand
pounds per square inch (psi). The receiver decides that a sampling plan giving
an acceptable quality level of 1 per cent defective will be appropriate.
Measurement of tensile strength is known to be more expensive than that of hardness
($x$), which can be cheaply measured in Vickers hardness numbers. In view of
this the receiver desires to use the suggested acceptance plan. The receiving
inspector groups the product into inspection lots of 500 items. From the initial
lot he selects at random a sample of $N = 100$ items to get a reliable estimate of
the dispersion matrix of the variables $y$ and $x$, which is assumed to remain
invariant from lot to lot. The items in this large sample are tested for both
tensile strength and hardness, and the data collected are utilized to provide a
linear relation between the variables $y$ and $x$ at the initial stage. Calculations
from this group of 100 pairs of readings showed $\bar{y}_N = 80$ (000) psi,
$\bar{x}_N = 170$, $\sigma_x = 15$, $\sigma_y = 8$ (000) psi, $\beta = 0.48$, so that the
correlation between $x$ and $y$ is equal to .90.

(a) From any lot after the initial one, only $n = 20$ items are sampled at
random and only the hardness ($x$) of each such item is recorded. The mean
$\bar{x}_n$ is then calculated for such a lot.

(b) $\bar{Y}_n$ may be evaluated from the equation

$$\bar{Y}_n = \bar{y}_N + \beta(\bar{x}_n - \bar{x}_N)
= 80 + 0.48(\bar{x}_n - 170)
= 0.48\,\bar{x}_n - 1.60$$

for each lot; the unit in which tensile strengths are measured is taken as (000)
psi.

(c) The standard deviation $\sigma_y$ is to be multiplied by a constant $\lambda = 1.988$,
which can be evaluated for the suggested plan from the derived formula

$$\lambda = K_{p_1} - K_\alpha/\sqrt{m},$$

where

$$\frac{1}{m} = \frac{R_k^2}{n} + \frac{1 - R_k^2}{N} = \frac{.81}{20} + \frac{.19}{100} = .0424,$$

$p_1 = .01$, $\alpha = .05$ and $K_p$ is the upper $100p$ % point of the normal distribution.
Hence by the probability integral table

$$\lambda = 2.3263 - 1.6449/4.856 = 1.988.$$


(d) $\bar{Y}_n$ and $\lambda\sigma_y = 15.90$ are added together, and if their sum

$$\bar{Y}_n + \lambda\sigma_y = .48\,\bar{x}_n + 14.30$$

is greater than $U_y = 100$ (000) psi or, in other words, when $\bar{x}_n > 178.54$, the
lot is rejected. Otherwise the lot is accepted. The procedures (a) and (d) are
repeated for each lot.

For instance, if by (a), the average hardness in two inspection lots from a 
sample of size 20 are found to be 175 and 185 respectively, then by our sug- 
gested plan the first lot is accepted but the second lot is rejected by (d) since 
175 <178.54 whereas 185> 178.54. 
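The arithmetic of steps (a) to (d) can be checked with a short script. This is a minimal sketch under the figures quoted in 5.1; the function and variable names (`upper_point`, `plan_constant`, `accept`) are illustrative and not part of the paper, and the relation $1/m = R^2/n + (1 - R^2)/N$ is the one that reproduces the quoted value .0424.

```python
from math import sqrt
from statistics import NormalDist


def upper_point(p):
    """Upper 100p% point K_p of the standard normal distribution."""
    return NormalDist().inv_cdf(1.0 - p)


def plan_constant(p1, alpha, n, N, R):
    """lambda = K_p1 - K_alpha / sqrt(m), with 1/m = R^2/n + (1 - R^2)/N."""
    m = 1.0 / (R**2 / n + (1.0 - R**2) / N)
    return upper_point(p1) - upper_point(alpha) / sqrt(m)


# Figures quoted in Section 5.1 (tensile strength in thousand psi).
y_bar_N, x_bar_N = 80.0, 170.0
sigma_y, beta, R = 8.0, 0.48, 0.90
n, N, U_y = 20, 100, 100.0

lam = plan_constant(p1=0.01, alpha=0.05, n=n, N=N, R=R)

# Y_n + lambda*sigma_y = beta*x_bar_n + intercept; reject when this exceeds U_y.
intercept = y_bar_N - beta * x_bar_N + lam * sigma_y
x_limit = (U_y - intercept) / beta


def accept(x_bar_n):
    """Accept the lot when the mean hardness does not exceed the limit."""
    return x_bar_n <= x_limit


print(f"lambda = {lam:.3f}, hardness limit = {x_limit:.2f}")
for x_bar in (175.0, 185.0):
    print(x_bar, "accept" if accept(x_bar) else "reject")
```

Expressing the rule as a single limit on mean hardness means the inspector only needs one comparison per lot once the initial sample has been processed.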

5.2. Next suppose that the sample size $n$ to be taken from any lot after the
initial one is not specified, but the receiver desires that his plan should accept
a lot having 4 per cent defectives with probability less than or equal to .10, in
addition to his requirements in 5.1.

In this case, as shown in the previous section, the values of $\lambda$ and $n$ in the
suggested plan should be evaluated from the formulas

$$\lambda' = \frac{K_\alpha K_{p_2} - K_{1-\beta} K_{p_1}}{K_\alpha - K_{1-\beta}} = 2.00,$$

since here

$$\alpha = .05,\ K_\alpha = 1.6449; \quad \beta = .10,\ K_{1-\beta} = -1.2816; \quad
p_1 = .01,\ K_{p_1} = 2.3263; \quad p_2 = .04,\ K_{p_2} = 1.7507,$$

and

$$n' = \frac{R_k^2}{\left(\dfrac{K_{p_1} - K_{p_2}}{K_\alpha - K_{1-\beta}}\right)^{2} - \dfrac{1 - R_k^2}{N}},
\quad \text{where } R_k = .90,$$

so that

$$n' = 22.01.$$

Hence samples of size 22 should be taken in the acceptance plan. Under this
plan a lot is to be rejected if $\bar{Y}_{n'} + \lambda'\sigma_y > U_y$, i.e., if
$.48\,\bar{x}_{n'} + 14.40$ is greater than $U_y = 100$, or when $\bar{x}_{n'} > 178.33$.
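The values $\lambda' = 2.00$ and $n' = 22.01$ can be reproduced numerically. The sketch below assumes the usual two-point formulas for a variables plan, written with positive upper points (so $K_\beta = -K_{1-\beta}$), together with the relation $1/m = R^2/n' + (1 - R^2)/N$ used in 5.1; the function names are illustrative only.

```python
from statistics import NormalDist


def upper_point(p):
    """Upper 100p% point K_p of the standard normal distribution."""
    return NormalDist().inv_cdf(1.0 - p)


def two_point_plan(p1, alpha, p2, beta_risk, N, R):
    """Return (lambda', n') for producer's point (p1, alpha) and
    consumer's point (p2, beta_risk)."""
    K_a, K_b = upper_point(alpha), upper_point(beta_risk)
    K_p1, K_p2 = upper_point(p1), upper_point(p2)
    lam = (K_a * K_p2 + K_b * K_p1) / (K_a + K_b)
    # Effective sample size m required by the two risk points.
    m = ((K_a + K_b) / (K_p1 - K_p2)) ** 2
    # Solve 1/m = R^2/n' + (1 - R^2)/N for n'.
    n_prime = R**2 / (1.0 / m - (1.0 - R**2) / N)
    return lam, n_prime


lam_prime, n_prime = two_point_plan(
    p1=0.01, alpha=0.05, p2=0.04, beta_risk=0.10, N=100, R=0.90
)
print(f"lambda' = {lam_prime:.2f}, n' = {n_prime:.2f}")
```

The denominator in the last step must be positive; if $(1 - R^2)/N$ exceeds $1/m$, the initial sample is too small for the required risks to be met by any $n'$.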

The suggested plan will be superior to the usual plan, in which only direct
measurements on $y$ are used for any inspection lot, if

$$\frac{C}{c_k} > 1 + R_k^2\,\frac{N}{n'} + (1 - R_k^2)\,\frac{n'}{N} = 4.72,$$

that is, when the tensile strength is more than 4.72 times as expensive to measure
as the hardness. However, if $C/c_k$ is known to be equal to 2, the minimum
number $\nu$ of lots to be inspected to make the suggested plan superior will be
given by the formula

$$2 > \frac{1}{\nu}\Bigl[(1 - R_k^2) + R_k^2\,\frac{N}{n'}\Bigr] + R_k^2 + (1 - R_k^2)\,\frac{n'}{N},$$

i.e., when $\nu > 3.37$. Thus at least 4 lots have to be inspected in order to make the
suggested plan superior in such a case.
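The two cost figures above follow from the right hand side of (4.1.9'). A minimal sketch, assuming that inequality in the form $C/c_k > \frac{1}{\nu}[(1 - R^2) + R^2 N/n'] + R^2 + (1 - R^2)n'/N$; the helper names are hypothetical.

```python
from math import floor


def cost_ratio_bound(nu, n, N, R):
    """Smallest cost ratio C/c_k (exclusive) at which the suggested plan
    is cheaper than the usual plan when nu lots are inspected."""
    r2 = R**2
    return ((1.0 - r2) + r2 * N / n) / nu + r2 + (1.0 - r2) * n / N


def lots_bound(cost_ratio, n, N, R):
    """Value that nu must exceed for the suggested plan to be cheaper."""
    r2 = R**2
    per_lot = r2 + (1.0 - r2) * n / N
    if cost_ratio <= per_lot:
        raise ValueError("suggested plan is never cheaper at this cost ratio")
    return ((1.0 - r2) + r2 * N / n) / (cost_ratio - per_lot)


bound_one_lot = cost_ratio_bound(nu=1, n=22.01, N=100, R=0.90)
nu_bound = lots_bound(cost_ratio=2.0, n=22.01, N=100, R=0.90)
lots_needed = floor(nu_bound) + 1  # smallest integer strictly above the bound

print(f"C/c_k must exceed {bound_one_lot:.2f} for a single lot")
print(f"nu must exceed {nu_bound:.2f}, i.e. at least {lots_needed} lots")
```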


6. MISCELLANEOUS COMMENTS 


6.1. The plan suggested in Section 3 will be superior to the corresponding
single sample plan for $y$ [1] only in the long run, when a large number of lots are
to be inspected, since the procedure demands that a large sample of size $N$
be collected for both the expensive variable $y$ and the rather inexpensive auxiliary
variable $x$ at the initial stage, whereas the usual plan utilizing $y$ alone will
require measurement of only one variable $y$. Section 4.1 too reveals this point.

6.2. Occasional checks on the validity of the assumed model, especially
whether the dispersion matrix remains unaltered and only the lot averages
vary from lot to lot, should be made; and any modification in the linear relation
between $y$ and $x$, if noticeable after some time has elapsed, should also be taken
into account in the above-mentioned plan.

6.3. For the closely analogous problem when a lower specification limit $L_y$
is given such that an item is considered defective if it lies below $L_y$, the
modifications required in the plan as suggested above are quite obvious.

6.4. The model assumed in the suggested plan would have been more realistic
if it were assumed that the population dispersion matrix of the variables,
although remaining the same, can only be estimated from the samples taken
from the lots. Here we have to assume that the population dispersion matrix
is unknown and its best estimate available is the sample dispersion matrix.
The acceptance sampling plan in such situations can be derived by proceeding
in an analogous manner, but as the algebra involved in this problem gets much
more complicated, this case has not been studied in this paper.
MINIMUM RISK SPECIFICATION LIMITS


F. H. Tingey and J. A. Merrill
Phillips Petroleum Company


In determining specification limits for items of product to be sub- 
jected to 100 per cent inspection both process variation and measure- 
ment error must be considered. Often costs associated with misclassifi- 
cation are of importance. These costs depend upon the nature of the 
misclassification and the amount by which an accepted item is defective 
or a rejected item is acceptable. The costs, with the process and
measurement variations, form the basis for defining the risk to be identified
with the inspection scheme. Mathematical formulas are derived and
tables given which facilitate the construction of minimum risk specifica- 
tion limits. A variety of assumptions about relations among costs and 
relations between inherent process variation and measurement error 
are covered. 


1. INTRODUCTION 


Products of a repetitive manufacturing process are subject to random
variations, as are the measurements of the products. Some quality characteristics
can be measured essentially without error. When product specifications
are written and rejection limits established with respect to such characteristics 
the inherent variation from item to item (process variation) should be consid- 
ered. When the inherent variation is ignored, the limits tend to be too tight. 
As a result the specifications are branded unrealistic by both the manufacturer 
and the receiver and ignored by both. Other quality characteristics have ap- 
preciable measurement error. This additional variation complicates writing 
meaningful specifications and establishing rational rejection limits. 
Developed in this paper is a method for establishing rejection limits for use 
in a screening (100 per cent) inspection program which minimizes the total risk 
associated with the misclassification of an item. This risk is based upon the 
economic consequences of misclassification as well as process and measurement 
variations.¹
2. DEFINITIONS AND ASSUMPTIONS 


In classifying items of product as defective or nondefective with regard to a 
given quality characteristic when measurement error is present, the following 
two classification errors are possible: 

1. An item may be classified as defective when actually it meets specifica- 

tions. 

2. An item may be classified as nondefective when actually it does not meet 

specifications. 

For any given true value $x$ for the quality characteristic a certain economic
background usually can be associated with the classification. This background 
results in: 

1. A certain loss incurred if the item is rejected when, in fact, it should be 

accepted. 





1 The problem which initiated this study was in conjunction with writing specifications for reactor fuel elements 
for the Materials Testing Reactor and the Engineering Test Reactor at the National Reactor Testing Station. 
2. A certain loss incurred if the item is accepted when it should be rejected.

The total risk associated with the inspection is defined as the expected value
of the conditional expected loss resulting from misclassification, the condition
being that the quality characteristic is a given value $x$. The expectation is with
regard to the distribution of $x$. The conditional expected loss for given $x$ is
simply the sum of the products of the probabilities of misclassification by the
respective losses.²

In order to determine the minimum risk specification limits in any given case 
the following definitions are needed: 

$\lambda_t$ and $\mu_t$ = the lower and upper technical limits, respectively. In the
absence of measurement error, these limits usually define the
specification limits. The limits need not be symmetrical with
respect to the process mean.

$\lambda_s$ and $\mu_s$ = the lower and upper specification limits, respectively.

$C_e(\lambda_t, \mu_t, \sigma_p, x)$ = the loss incurred in accepting an item whose quality is $x$.
$C_p(\lambda_t, \mu_t, \sigma_p, x)$ = the loss incurred in rejecting an item whose quality is $x$.
$\sigma_p$ = process standard deviation.
$\sigma_m$ = measurement standard deviation. (If the average of $n$ independent
measurements is used, $\sigma_m$ must be divided by $\sqrt{n}$ to
use the table and formulas of this paper.)
$\mu$ = process mean.

It is assumed that both true quality and measurement error are normally 
distributed. 

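It is convenient to express $\lambda_t$ and $\mu_t$ as deviations from $\mu$ in units of
$\sigma_p$, and similarly the deviations of $\lambda_s$ and $\mu_s$ from $\lambda_t$ and $\mu_t$,
respectively, in units of $\sigma_m$. Thus, $k_1$ and $k_2$ are defined by the relationships

$$\lambda_t = \mu - k_1\sigma_p, \qquad \mu_t = \mu + k_2\sigma_p, \tag{1}$$

and $b_1$ and $b_2$ by

$$\lambda_s = \lambda_t + b_1\sigma_m, \qquad \mu_s = \mu_t - b_2\sigma_m. \tag{2}$$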

The inspection rule is simply: “If the measured value for the quality of an
item falls between $\lambda_s$ and $\mu_s$, the item will be accepted, otherwise it will be
rejected.”

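Table 267 gives values of $b_1$ and $b_2$ for various ratios $r = \sigma_p/\sigma_m$ and loss
functions

$$C_e(\lambda_t, \mu_t, \sigma_p, x) =
\begin{cases}
C_e^{(1)}\left(\dfrac{\lambda_t - x}{\sigma_p}\right) & \text{if } x < \lambda_t \\[1ex]
0 & \text{if } \lambda_t \le x \le \mu_t \\[1ex]
C_e^{(2)}\left(\dfrac{x - \mu_t}{\sigma_p}\right) & \text{if } x > \mu_t
\end{cases} \tag{3}$$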




2 Grubbs, Frank E., and Coon, Helen J., “On Setting Test Limits Relative to Specification Limits,” Industrial 
Quality Control, Vol. 10, No. 5, March 1954, 15-20, considered a similar problem; however, in their treatment 
limits were assumed to be symmetrical about the mean and losses were assumed to be constant in the region of 
misclassification. The notation used in this paper is essentially that of Grubbs and Coon. 
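$$C_p(\lambda_t, \mu_t, \sigma_p, x) =
\begin{cases}
C_p & \text{if } \lambda_t \le x \le \mu_t \\
0 & \text{otherwise.}
\end{cases}$$

Asterisks in the table indicate, in essence, that all items submitted should be
rejected. Two additional cases are considered in Section 4. These are
characterized by loss functions

$$C_e(\lambda_t, \mu_t, \sigma_p, x) =
\begin{cases}
C_e^{(1)} & \text{if } x < \lambda_t \\
0 & \text{if } \lambda_t \le x \le \mu_t \\
C_e^{(2)} & \text{if } x > \mu_t,
\end{cases}
\qquad
C_p(\lambda_t, \mu_t, \sigma_p, x) =
\begin{cases}
C_p & \text{if } \lambda_t \le x \le \mu_t \\
0 & \text{otherwise}
\end{cases} \tag{4}$$

and

$$C_e(\lambda_t, \mu_t, \sigma_p, x) =
\begin{cases}
0 & \text{if } x \le \mu_t \\[1ex]
C_e^{(2)}\left(\dfrac{x - \mu_t}{\sigma_p}\right) & \text{if } x > \mu_t,
\end{cases}
\qquad
C_p(\lambda_t, \mu_t, \sigma_p, x) =
\begin{cases}
C_p & \text{if } \lambda_t \le x \le \mu_t \\
0 & \text{otherwise.}
\end{cases} \tag{5}$$

The constants $C_e^{(1)}$, $C_e^{(2)}$ and $C_p$ are simply unit or integrated losses, as the
case may be, in the respective regions of misclassification.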
3. ILLUSTRATIVE EXAMPLE 


The following large sample estimates have been obtained for the U-235 content
of the Materials Testing Reactor fuel elements:

$$\mu = 200 \text{ grams}, \qquad \sigma_p = 2 \text{ grams}, \qquad \sigma_m = 2 \text{ grams}.$$

The technical limits resulting from reactor physics calculations have been set at

$$\lambda_t = 194 \text{ grams}, \qquad \mu_t = 204 \text{ grams}.$$

From consideration of reactor hazards, the cost ratios reflecting the economic
consequences of misclassification are estimated to be

$$C_e^{(1)}/C_p = 2, \qquad C_e^{(2)}/C_p = 10.$$

By equation (1) we compute

$$k_1 = \frac{200 - 194}{2} = 3, \qquad k_2 = \frac{204 - 200}{2} = 2.$$

For this application $r = \sigma_p/\sigma_m = 1$. We assume the loss functions as given
by equations (3), so the table applies. In particular for $k_1 = 3$, $k_2 = 2$, $r = 1$,
$C_e^{(1)}/C_p = 2$, $C_e^{(2)}/C_p = 10$ we find³ $b_1 = -2.917$ and $b_2 = -0.819$. From (2) the
minimum risk specification limits are found to be

$$\lambda_s = 194 + (-2.917)(2) = 188.2$$
$$\mu_s = 204 - (-0.819)(2) = 205.6.$$

4. MATHEMATICAL FORMULATION 


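We assume the distribution of quality from item to item is normal with mean
$\mu$ (assumed to be zero without loss of generality) and standard deviation $\sigma_p$.
For given quality $x$, measurements are also assumed to be normal with
mean $x$ and standard deviation $\sigma_m$. We supplement the definitions of Section 2
by

$$\phi(t) = \frac{1}{\sqrt{2\pi}}\,e^{-t^2/2}, \qquad \Phi(t) = \int_{-\infty}^{t} \phi(z)\,dz.$$

The conditional expected loss resulting from misclassification, the condition
being that the quality characteristic is $x$, is

$$R(x) = G(x)\cdot C_e(\lambda_t, \mu_t, \sigma_p, x) + [1 - G(x)]\cdot C_p(\lambda_t, \mu_t, \sigma_p, x) \tag{6}$$

where

$$G(x) = \Phi\!\left(\frac{\mu_s - x}{\sigma_m}\right) - \Phi\!\left(\frac{\lambda_s - x}{\sigma_m}\right). \tag{7}$$

The total risk, $R$, associated with the inspection scheme is defined as the
expected value of $R(x)$, i.e.

$$R = \int_{-\infty}^{\infty} R(x)\,\frac{1}{\sigma_p}\,\phi\!\left(\frac{x}{\sigma_p}\right) dx. \tag{8}$$

For the loss functions given by (3), the total risk defined by (8) becomes

$$R = \int_{-\infty}^{\lambda_t} C_e^{(1)}\left(\frac{\lambda_t - x}{\sigma_p}\right) G(x)\,\frac{1}{\sigma_p}\,\phi\!\left(\frac{x}{\sigma_p}\right) dx
+ \int_{\lambda_t}^{\mu_t} C_p\,[1 - G(x)]\,\frac{1}{\sigma_p}\,\phi\!\left(\frac{x}{\sigma_p}\right) dx$$
$$\qquad + \int_{\mu_t}^{\infty} C_e^{(2)}\left(\frac{x - \mu_t}{\sigma_p}\right) G(x)\,\frac{1}{\sigma_p}\,\phi\!\left(\frac{x}{\sigma_p}\right) dx. \tag{9}$$

By the change of variable
$v = x/\sigma_p$ in (9), the definition $r = \sigma_p/\sigma_m$, and reference to (1), (2) and (7)
we obtain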




3 It is to be noted that the tabulation for $k_1 = 2$, $k_2 = 3$, $r = 1$, was used. However, in so doing the labels on the
cost ratios were interchanged as well as the labels on the $b$'s.
$$R = -C_e^{(1)} \int_{-\infty}^{-k_1} (k_1 + v)\{\Phi[(k_2 - v)r - b_2] - \Phi[b_1 - (k_1 + v)r]\}\,\phi(v)\,dv$$
$$\qquad + C_p \int_{-k_1}^{k_2} \{1 - \Phi[(k_2 - v)r - b_2] + \Phi[b_1 - (k_1 + v)r]\}\,\phi(v)\,dv \tag{10}$$
$$\qquad + C_e^{(2)} \int_{k_2}^{\infty} (v - k_2)\{\Phi[(k_2 - v)r - b_2] - \Phi[b_1 - (k_1 + v)r]\}\,\phi(v)\,dv.$$

The condition imposed on (10) to arrive at specification limits is that this
function be a minimum, the minimization being with respect to $b_1$ and $b_2$. By
equating the partial derivatives of (10), with respect to $b_1$ and $b_2$, to zero and
simplifying we have the two equations

$$w(u_i) = \frac{C_e^{(1)}}{C_p}\{u_i[1 - \Phi(u_i)] - \phi(u_i)\}
+ \frac{C_e^{(2)}}{C_p}\{(K - u_i)[1 - \Phi(K - u_i)] - \phi(K - u_i)\}$$
$$\qquad + \sqrt{r^2 + 1}\,[\Phi(K - u_i) + \Phi(u_i) - 1] = 0 \tag{11}$$

where $i = 1, 2$,

$$u_1 = \frac{k_1 + r b_1}{\sqrt{r^2 + 1}}, \tag{12}$$

$$u_2 = \frac{r^2(k_1 + k_2) + k_1 - r b_2}{\sqrt{r^2 + 1}}, \tag{13}$$

and

$$K = (k_1 + k_2)\sqrt{r^2 + 1}. \tag{14}$$

In any given case the solution to the pair of equations given by (11) can be
effected in a trial and error manner. In particular, two values for $u_i$ solve (11).
If these in turn are substituted in (12) and (13) to obtain $b_1$ and $b_2$, respectively,
the specification limits defined by (2) can be obtained. The choice of which root
of (11) to equate to $u_1$ and which to $u_2$ can be made in an arbitrary manner
since the same ultimate specification limits as defined by (2) will result. In one
case, however, they will be incorrectly labeled.
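The trial and error solution of (11) can be mechanized with a bisection search. The sketch below is one possible implementation, not the authors' procedure: it assumes (11) in the form $w(u) = (C_e^{(1)}/C_p)\{u[1 - \Phi(u)] - \phi(u)\} + (C_e^{(2)}/C_p)\{(K - u)[1 - \Phi(K - u)] - \phi(K - u)\} + \sqrt{r^2 + 1}\,[\Phi(K - u) + \Phi(u) - 1]$, takes the root below $K/2$ as $u_1$ and the root above $K/2$ as $u_2$, and inverts (12) and (13). All function names are illustrative.

```python
from math import erfc, exp, pi, sqrt


def phi(t):
    """Standard normal density."""
    return exp(-0.5 * t * t) / sqrt(2.0 * pi)


def upper_tail(t):
    """1 - Phi(t), computed with erfc to stay accurate for large t."""
    return 0.5 * erfc(t / sqrt(2.0))


def w(u, K, s, c1, c2):
    """Left side of (11); c1, c2 are the two cost ratios, s = sqrt(r^2 + 1)."""
    a = K - u
    return (c1 * (u * upper_tail(u) - phi(u))
            + c2 * (a * upper_tail(a) - phi(a))
            + s * (1.0 - upper_tail(a) - upper_tail(u)))


def bisect(f, lo, hi, tol=1e-10):
    """Plain bisection; requires a sign change of f on [lo, hi]."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change: no minimum risk limits in this bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def minimum_risk_factors(k1, k2, r, c1, c2):
    """Return (b1, b2) from the two roots of (11)."""
    s = sqrt(r * r + 1.0)
    K = (k1 + k2) * s

    def f(u):
        return w(u, K, s, c1, c2)

    u1 = bisect(f, -10.0, K / 2.0)      # root near zero
    u2 = bisect(f, K / 2.0, K + 10.0)   # root near K
    b1 = (u1 * s - k1) / r                           # from (12)
    b2 = (r * r * (k1 + k2) + k1 - u2 * s) / r       # from (13)
    return b1, b2


# Illustrative example of Section 3: k1 = 3, k2 = 2, r = 1, cost ratios 2 and 10.
b1, b2 = minimum_risk_factors(k1=3.0, k2=2.0, r=1.0, c1=2.0, c2=10.0)
print(f"b1 = {b1:.3f}, b2 = {b2:.3f}")
```

For the example of Section 3 this gives factors that agree with the tabulated $b_1 = -2.917$ and $b_2 = -0.819$ to three decimals. The search brackets of width 10 on either side are an arbitrary choice; a failed sign check corresponds to the situation the table marks with an asterisk.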

An excellent approximate solution to (11), which holds over a considerable
range of values of the parameters of the equation ($K > 3$ and pertinent cost
ratios less than 10), is obtained by expanding the function

$$g(u_i) = \frac{C_e^{(1)}}{C_p}\,[u_i(1 - \Phi(u_i)) - \phi(u_i)] + \sqrt{r^2 + 1}\,\Phi(u_i) \tag{15}$$

into a Maclaurin series and the function

$$h(K - u_i) = \frac{C_e^{(2)}}{C_p}\{(K - u_i)[1 - \Phi(K - u_i)] - \phi(K - u_i)\}
+ \sqrt{r^2 + 1}\,\Phi(K - u_i) \tag{16}$$

into a Taylor series about $K$ and noting that (11) is

$$w(u_i) = g(u_i) + h(K - u_i) - \sqrt{r^2 + 1} = 0.$$

If terms up to and including the quadratic in the series expansion are retained,
the coefficients

$$A = -\frac{1}{2}\left[\frac{1}{\sqrt{2\pi}}\,\frac{C_e^{(1)}}{C_p} + \frac{C_e^{(2)}}{C_p}\,\phi(K) + K\phi(K)\sqrt{r^2 + 1}\right] \tag{17}$$

$$B = \frac{1}{2}\,\frac{C_e^{(1)}}{C_p} - \frac{C_e^{(2)}}{C_p}\,[1 - \Phi(K)] + \sqrt{r^2 + 1}\left\{\frac{1}{\sqrt{2\pi}} - \phi(K)\right\} \tag{18}$$

$$C = -\frac{1}{\sqrt{2\pi}}\,\frac{C_e^{(1)}}{C_p} + \frac{C_e^{(2)}}{C_p}\{K[1 - \Phi(K)] - \phi(K)\} + \sqrt{r^2 + 1}\,[\Phi(K) - \tfrac{1}{2}] \tag{19}$$

in the quadratic ($Au_i^2 + Bu_i + C = 0$) representation of (11) result. A further
simplification can be effected by noting that in almost any practical case, $K$ as
defined by (14) will be greater than 3 and hence

$$\Phi(K) \approx 1$$

and

$$\phi(K) \approx 0.$$

In applying the quadratic formula to effect a solution for the $u_i$'s it is to be
noted that, since the validity of the approximation depends upon the proximity
of the solution to zero, that root which is closest to zero is to be retained and
used to solve either (12) or (13) (not both) for the corresponding $b_i$. The
second root can be approximated (and hence the other $b_i$ determined) by
observing the symmetry of the fundamental equation, (11), in the cost ratios
$C_e^{(1)}/C_p$ and $C_e^{(2)}/C_p$ coupled with the variables $u_i$ and $K - u_i$. Thus an
interchange of $C_e^{(1)}/C_p$ and $C_e^{(2)}/C_p$ in equations (17), (18) and (19) and the
application of the quadratic formula to the resulting coefficients to solve for the root
closest to zero in the variable $K - u_i$ results in the desired solution.
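The quadratic approximation can be sketched as follows. The coefficients used here are obtained by expanding the two parts of (11) to second order about $u = 0$ and about $K$, as the text describes; they are a reconstruction under that assumption and may differ in arrangement from the printed (17) to (19). Function names are illustrative.

```python
from math import erfc, exp, pi, sqrt


def phi(t):
    """Standard normal density."""
    return exp(-0.5 * t * t) / sqrt(2.0 * pi)


def upper_tail(t):
    """1 - Phi(t)."""
    return 0.5 * erfc(t / sqrt(2.0))


def quadratic_root(c_near, c_far, K, s):
    """Root closest to zero of A u^2 + B u + C = 0, where c_near is the cost
    ratio attached to the variable being solved for and c_far the other one."""
    root_2pi = sqrt(2.0 * pi)
    A = -0.5 * (c_near / root_2pi + c_far * phi(K) + s * K * phi(K))
    B = 0.5 * c_near - c_far * upper_tail(K) + s * (1.0 / root_2pi - phi(K))
    C = (-c_near / root_2pi
         + c_far * (K * upper_tail(K) - phi(K))
         + s * (0.5 - upper_tail(K)))
    disc = sqrt(B * B - 4.0 * A * C)
    roots = ((-B + disc) / (2.0 * A), (-B - disc) / (2.0 * A))
    return min(roots, key=abs)


def approximate_factors(k1, k2, r, c1, c2):
    """Approximate (b1, b2); the second root is found in the variable K - u
    after interchanging the cost ratios."""
    s = sqrt(r * r + 1.0)
    K = (k1 + k2) * s
    u1 = quadratic_root(c1, c2, K, s)
    a2 = quadratic_root(c2, c1, K, s)   # a2 approximates K - u2
    b1 = (u1 * s - k1) / r              # from (12)
    b2 = (a2 * s - k2) / r              # from (13) and (14): K - u2 = (k2 + r b2)/s
    return b1, b2


b1_approx, b2_approx = approximate_factors(3.0, 2.0, 1.0, 2.0, 10.0)
print(f"approximate b1 = {b1_approx:.3f}, b2 = {b2_approx:.3f}")
```

For the example of Section 3 the approximation is very close for the limit whose cost ratio is 2 and noticeably rougher for the limit whose cost ratio is 10, which is consistent with the stated range of validity and with using it only as a starting point for trial and error.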

Table 267 gives solutions of (11) for selected values of the parameters. These 
values were obtained by first approximating to the solutions in the manner de- 
scribed above and then proceeding in a trial and error fashion to the more exact 
solutions. 

For the loss functions given by (4), the total risk as defined by (8) is

$$R = C_e^{(1)} \int_{-\infty}^{-k_1} \{\Phi[(k_2 - v)r - b_2] - \Phi[b_1 - (k_1 + v)r]\}\,\phi(v)\,dv$$
$$\qquad + C_p \int_{-k_1}^{k_2} \{1 - \Phi[(k_2 - v)r - b_2] + \Phi[b_1 - (k_1 + v)r]\}\,\phi(v)\,dv$$
$$\qquad + C_e^{(2)} \int_{k_2}^{\infty} \{\Phi[(k_2 - v)r - b_2] - \Phi[b_1 - (k_1 + v)r]\}\,\phi(v)\,dv \tag{20}$$

and the minimizing equations are

$$\frac{C_e^{(2)}}{C_p}\,[1 - \Phi(K - u_i)] + \frac{C_e^{(1)}}{C_p}\,[1 - \Phi(u_i)]
- [\Phi(K - u_i) + \Phi(u_i) - 1] = 0, \qquad i = 1, 2. \tag{21}$$

Approximate solutions to (21), found in the manner described above are 


Cm 


By “trial and error” starting from these approximate solutions, the actual solu- 
tions of (21) in any given case can be obtained. Application of (1), (2), (12), 
(13) and (14) results in the desired specification limits. 

In many applications, only a single technical limit is of any consequence.
Such might be the case in reactor fuel elements, since the consequence of
accepting an element which is light in fissionable material is negligible compared to
accepting one having considerably more fuel than the nominal amount. Thus
for loss functions given by (5) the minimizing equation is obtained from (11)
by letting $k_1 = \infty$ and $C_e^{(1)}/C_p = 0$. The solution for $u_2$ is therefore obtained from

$$\frac{C_e^{(2)}}{C_p}\{(K - u_2)[1 - \Phi(K - u_2)] - \phi(K - u_2)\} + \sqrt{r^2 + 1}\,\Phi(K - u_2) = 0 \tag{24}$$

where

$$K - u_2 = \frac{k_2 + r b_2}{\sqrt{r^2 + 1}},$$

since

$$\Phi(u_2) = 1$$

and

$$\phi(u_2) = 0$$

under these conditions.





TABLE 267 


FACTORS FOR COMPUTING MINIMUM RISK SPECIFICATION LIMITS 
$k_1 = 1.0$ and $k_2 = 1.0$


r=1.0 








CoN /Cp 


¥ 








| 1 |—1.053 
—1.053 








| 2 |—1.053 
| |—0. 683 








—0.262 
—0.262 





—0.248 
0.231 














0.266 
0.266 








Ce/Cp 5 |—1.053 
|—0. 184 


| 10 |—1.053 
| | 0.192 
100 |—1.053 

1.332 








1 








—0.184 
0.192 


—0.184 
1.332 


r=8.0 


0.192 | 
0.192 | 

| 0.192 | 1.34 
1.334 





Col/Cp 











—1.050 
—1.050 


1 








—1.050 
—0.720 


—0.720 
—0.720 


2 








—1.050 
—0.267 


—0.720 
—0.267 


—0.267 
—0.267 


IC/Cp 5 


—0.468 
—0.468 








—1.050 
0.081 


—0.720 
0.081 


—0.267 
0.081 





—0.468 
—0.130 








—1.050 
1,179 











—0.720 
1.179 





—0.267 
1.179 




















$b_1$ = upper entry. $b_2$ = lower entry.


r=1.0 


$k_1 = 1.0$ and $k_2 = 1.5$








—0.468 

















|—0.130 


0.977 | 0.977 


r=2.0 








2 


Ce Vo 








—0.916 
—1.892 


—0.284 
—1.892 





—0.916 
—1.416 


—0.283 
—1.416 


10 
0.185 
— 1.892 
0.186 
—1.416 


Col Cy 





—1.184 
—1.303 





—0.184 
—0.933 








—0.916 
—0.783 


—0.283 
—0.783 


0.188 
—0.781 


Ce /Cp 


—0.184 
—0.374 








—0.916 
—0.314 


—0.281 
—0.312 


0.191 
—0.309 





—0.910 
1.115 





—0.258 
1.143 








0.263 
1.199 





—0.184 
—0.058 

















—0.184 
1.082 

















r=4.0 


r=8.0 








1 


Cc. W/¢, 
e 2 Pp 


c.9/¢ 
5 











—1.050 
—1.175 


—1.050 


—0.846 


—0.720 
—1.175 


—0.267 
—1.175 





—0.720 
—0.846 





—1.050 
—0.392 


—0.720 
—0.392 


|—0. 267 
—0.846 


—0.267 
—0.392 


|—0.392 


—0.468 
—1.268 


—1.268 





—0.468 
—0.960 


—0.130 
—0.960 





—0.468 
—0.530 


—0.130 
—0.530 








—1.050 
—0.043 


—0.720 
—0.043 


—0.267 
—0.043 


0.081 
—0.043 


—0.468 
—0.193 


—0.130 
—0.193 








—1.050 





—0.7 


—0.267 
1.054 








20 
1.054 | 1.054 


0.081 
1.054 

















—0.468 
0.914 








—0. 130 
0.914 











$b_1$ = upper entry. $b_2$ = lower entry.











TABLE 267—Continued
$k_1 = 1.0$ and $k_2 = 2.0$


r=2.0 








CoC 





—0.184 
—1.553 


—0.184 
—1.183 


CeP/Cp 5 |—1.392 |—0. . ; : D/ .683 |—0. 184 
11.285 |—1.: : 3 Y 684 |—0.684 


— 1.392 : , \ ‘ d 6 —0.184 
—0.819 , , ’ . 3 ls —0.308 


—1.392 , ° R . ° ; —0.184 
0.576 5 , ° . . le 0.832 






























































r=8.0 








CeCe 


| 100 1 2 10 











-0.7 1.17 
1. 1.300 |—1. -300 |—1. 300 —1.330 |—1.330 |—1.330 |—1.330 


73 
3 





—1.050 —0.720 


—0.267 | 0.081 1.179 ~1.205 |—0.897 |—0.468 |—0. 130 
—0.970 |—0.970 


—0.970 |—0.970 —1.022 |—1.022 |—1.022 |—1.022 


| 

20 |-0 267 | 0.081 79 —1.205 |—0.897 |—0.468 |—0. 130 
+ a 

| 

| 


—0.970 


| 





—1.050 |—0.720 |—0.267 | 0.081 | 1.179 2), —1.205 |—0.897 |—0.468 |—0.130 
—0.517 ee —0.593 |—0.593 |—0.593 |—0.593 





—1.050 |—0.720 |—0.267 | 0.081 | 1.179 —1.205 |—0.897 |—0.468 |—0.130 
—0.168 |—0.168 |—0.168 |—0.168 |—0. 168 —0.255 |—0.255 |—0.255 |—0.255 


| | 


—1.050 |—0.720 |—0.267 | 0.081 | 1.179 100 |—1.205 |—0.897 |—0. 468 rem 

















0.929 | 0.929 0.929 | 0.929] 0.929 0.852 | 0.852 | 0.852 | 0.852 




















$b_1$ = upper entry. $b_2$ = lower entry.

















—1.392 |—0.917 
—2.892 |—2.892 
—1.392 |—0.917 
—2.417 |—2.417 














ay 392 |—0.917 | 
—1.785 |—1.785 j-1. 7 


—1.392 |—0.917 |—0. 
—1.319 |—1.319 |—1.: 
—1.392 |—0.917 |—0.2 
0.071 | 0.071 0.071 















































r=4.0 








C/Cp | 

1} 2 | 5 | 0 100 
—1.050 |—0.720 —0.267 | 0.081 | 1.179 | 
1.425 |—1.425 |—1.425 |—1.425 |—1.425 | 














2 |—1.050 |—0.720 |—0.267 | 0.081 | 1.179. 
gm 095 |—1.095 |—1.095 |—1.095 |—1.095 








cc, 5 “\=7.050 |—0.720 |—0.267 | 0.081 | 1.179. 
|-0. 642 |—0.642 |—0.642 |—0.642 |—0. 642 | 








70 |—1.050 |—0.720 |—0.267 | 0.081 | 1.179 | 
—0.293 | 0.298 |—0.293 |—0.293 |—0.293 | 
— 

1.050 |—0.720 |—0.267 } 0.081 | 1.178 | 
0.804 | 0.804 | 0.804) 0.804 | 0.804 | 





oad 





























$b_1$ = upper entry. $b_2$ = lower entry.





TABLE 267—Continued 
$k_1 = 1.0$ and $k_2 = 3.0$
r=1.0 r=2.0 








Ce/C, CWC; 
°'5 Pp a P 


10 


—0.285 | 0.181 —0.683 |—0.184° 
—3.392 |--3.392 —2.053 | —2.053 


—0.285 | 0.181 —0.683 ~0.184 
—2.917 |--2.917 —1.683 |—1.683 


—0.285 | 0.181 —0.683 |—0.184 
—2.285 |-—-2.285 —1.184 |—1.184 


—0.285 | 0.181 —0.683 |—0.184 
—1.819 |—1.819 —0.808 |—0.808 


—0.285 | 0.181 —0.683 |—0. 184 
—0.429 |—0.428 0.332 | 0.332 

































































r=4.0 r=8.0 








Ce /Cp : Ce /Cp 
1 2 5 2 5 10 


—1.050 |—0.720 |—0.267 ‘ : —0.468 |—0. 130 
—1.550 |—1.550 |—1.550 ’ ‘ —1.455 |—1.455 


—1.050 |—0.720 |—0.267 ‘ ‘ —0.468 |—0. 130 
—1.220 |—1.220 |—1.220 : ‘ —1.147 |—1.147 


—1.050 |—0.720 |—0.267 ° —0.468 |—0.130 
—0.767 |—0.767 |—0.767 ° —0.718 |—0.718 


—1.050 |—0.720 |—0.267 ; —0.468 |—0.130 
—0.418 |—0.418 |—0.418 . —0.380 |—0.380 


—1.050 |—0.720 |—0.267 —0.468 |—0.130 
0.679 | 0.679 | 0.679 0.727 | 0.727 













































































$b_1$ = upper entry. $b_2$ = lower entry.


$k_1 = 1.5$ and $k_2 = 1.5$








¥ 














—0.933 
—0.933 


—0.933 my 








—0.374 |—0.374 


—0.933 |—0.374 
—0.058 |—0.058 
—0.933 |—0.374 |—0.058 | 1.081 

1.082 | 1.082 | 1.082] 1.081 















































r=8.0 








1 5 1 1 


—1.175 —1.268 
—1.175 —1.268 
—1.175 |—0.846 —1.268 
—0.846 |—0.846 —0.960 |—0.960 


—1.175 |—0.846 |—0.392 C-2)/ —1.268 |—0.960 |—0.530 
—0.392 |—0.392 |—0.392 —0.530 |—0.530 |—0.530 
—1.175 |—0.846 |—0.392 —1.268 |—0.960 |—0.530 |—0.193 
—0.043 |—0.043 |—0.043 —0.193 |—0.193 |—0. 193 |—0.193 





























—1.175 |—0.846 |—0.392 —1.268 |—0.960 |—0.530 
1.428 | 1.428] 1.428 0.914 | 0.914] 0.914 


—0.193 | 0.914 
0.914} 0.914 


! 


























Stee 

















$b_1$ = upper entry. $b_2$ = lower entry.





TABLE 267—Continued 
$k_1 = 1.5$ and $k_2 = 2.0$
r=1.0 








Ce /Cp 








—0.785 
—2.392 


—0.785 
—1.917 


—0.785 
—1.285 
—0.785 
—0.819 


—0.785 
0.571 




































































r=4.0 








C W7¢ 
1 a £ €" 


|—1.175 |—0.846 |-0.392 
—1:300 |—1.300 |—1.300 


—1.175 |—0.846 |—0.392 
| —0.970 |—0.970 |—0.970 


\Ce/Cp 5 |—1.175 |—0.846 |—0.392 
—0.517 |—0.517 |—0.517 


—1.175 |—0.846 |—0.392 
—0.168 |—0.168 |—0.168 


—1.175 |—0.846 |—0.392 
0.929 | 0.929) 0.929 













































































$b_1$ = upper entry. $b_2$ = lower entry.


$k_1 = 1.5$ and $k_2 = 2.5$
r=1.0 








Coe/Cp 








—0.785 
—2.892 


—0.785 
—2.417 


—0.785 
—1.785 


—0.785 
—1.319 


—0.785 
0.071 




































































r=4.0 








cS 7 [Cp 


} 1p 


—1.175 |—0.846 |—0.392 
—1.425 |—1.425 |—1.425 














—1.175 |—0.846 |—0.392 
—1.095 |—1.095 |—1.095 


~—1.175 |—0.846 |—0.392 C/Cp 
—0.642 |—0.642 |—0.642 


—1.175 |—0.846 |—0.392 
—0.293 |—0.293 |—0.293 


—1.175 |—0.846 |—0.392 
0.804 | 0.804) 0.804 

































































$b_1$ = upper entry. $b_2$ = lower entry.





TABLE 267—Continued 


$k_1 = 1.5$ and $k_2 = 3.0$
r=1.0 








CNC 





—0.785 
—3.392 


2 —0.785 
—2.917 
C/Cp 5 |-1. = 9.785 


10 —0.785 
—1.819 


100 —0.785 
—0.429 




































































r=4.0 








CY /Cp 
1 2 5 10 1 


—1.175 |—0.846 |—0.392 |—0.043 —1.268 
—1.550 |—1.550 |--1.550 |—1.550 —1.455 


—1.175 |—0.846 |—0.392 |—0.043 —1.268 
—1.220 |—1.220 |—1.220 |—1.220 —1.147 


—1.175 |—0.846 |—0.392 |—0.043 05 —1.268 
—0.767 |—0.767 |—0.767 |—0.767 0.718 


—1.175 |—0.846 |—0.392 |—0.043 —1.268 
—0.417 |—0.418 |—0.418 |—0.418 —0.380 





























—1.175 |—0.846 |—0.392 |—0.043 —1.268 
0.679 | 0.679 | 0.679 | 0.679 0.727 















































$b_1$ = upper entry. $b_2$ = lower entry.


$k_1 = 2.0$ and $k_2 = 2.0$













































































1 


—1.300 
—1.300 


—1.300 |—0.970 
—0.970 |—0.970 


—1.300 |—0.970 |—0.517 
—0.517 |—0.517 |—0.517 
—1.300 |—0.970 |—0.517 
—0.168 |—0.168 |—0.168 


—1.300 |—0.970 |—0.517 
0.929 | 0.929 | 0.929 










































































$b_1$ = upper entry. $b_2$ = lower entry.





r=1.0 


TABLE 267—Continued 
$k_1 = 2.0$ and $k_2 = 2.5$





CRMC 


10 





2 








—1.285 
—2.892 


—0.819 
—2.892 





—1.285 
—2.417 


—0.819 
—2.417 











~-1.285 
—1.785 
—1.285 
—1.319 


0.819 
—1.785 


—0.819 
—1.319 











—1.285 
0.071 





—0.819 
0.071 








1 


—1.183 
—1.803 


—0.308 
—1.803 





2 


—1.183 
~ 1.433 


—6.308 
—1.433 





C/Cp 5 


—1.183 
—0.934 


—0.308 
—0.934 





10 


—1.183 
—0.558 


—0.308 
—0.558 





100 








—1.183 
0.582 








—0.308 
0.582 











r=4.0 





1 


2 


CNC 


10 











—1.300 
—1.425 


—0.970 
—1.425 


—0.517 
—1.425 


—0.168 
—1.425 





—1.300 
—1.095 


—0.970 
—1.095 


—0.517 
—1.095 


—0.168 
—1.095 





|—1.300 
—0.642 


—0.970 
—0.642 


—0.517 
—0.642 


—0. 168 
—0.642 





—1.300 
—0.293 


—0.970 
—0.293 


—0.517 
—0.293 


—0. 168 
—0.293 





—1.300 
0.804 





—0.970 
0.804 








—0.517 
0.804 





—0. 168 
0.804 















































$b_1$ = upper entry. $b_2$ = lower entry.


$k_1 = 2.0$ and $k_2 = 3.0$





r=2.0 





c-/¢ 
ay 





| 
C/Cp 


| 
| 
































|—0.684 
|—2.053 





—0.684 
—1.683 





—0.684 
—1.184 














—0.684 























1 


2 





1 


2 


10 


100 








—1.300 
—1.550 


—0.970 
—1.550 





—1.300 
— 1.220 
—0.767 


—1.300 
—0.418 





0.679 


; |—1.300° 


—0.970 
—1.220 
—0.970 
—0.767 
—0.970 
|—0.418 





—1.300 |—0.970 


0.679 











—0.168 
0.679 





0.929 





$b_1$ = upper entry. $b_2$ = lower entry.





—1 
—1. 


-330 
455 


—1.022 
—1.455 


—0.255 
—1.455 


0.852 
—1.455 





2) 


C/Cp & 








0.679 
panel 


x 


—1.330 
—1.147 


—1.022 
—1.147 





—1.330 
—0.718 


1.330 
80 


—1.330 
0.727 


—1.022 
—0.718 


—0.255 
—1.147 
—0.255 
—0.718 


0.852 
—1.147 
0.852 
—0.718 








—1.022 
|—0.380 


—1.022 


0.727 


—0.593 


0.727 





—0.255 





—0.255 
0.727 





0.852 


0.852 
0.727 











TABLE 267—Continued 
$k_1 = 2.5$ and $k_2 = 2.5$


























—1.785 
—1.785 


—1.785 |—1.319 
—1.319 |—1.319 


—1.785 |—1.319 | 0.070 
0.071 | 0.071 | 0.070 
























































r=4.0 








CoN /Cp 














—1.095 
—1.095 











.095 |—0.642 | —0.655 
0. .642 |—0.642 —0.655 


1 
1 
—1 
—0 
10 |—1. —1.095 |—0.642 =o: | —0.655 |—0.318 
0 
—1 
0 


C2/Cp 5 |—1. 
|- 


| 








—0. -293 |—0.293 |—0.293 —0.318 |—0.318 


.095 |—0.642 |—0.293 | 0.804 —0.655 |—0.318 
-804 | 0.804 0.804 | 0.804 0.789 | 0.789 











100 |—1. 
0. 












































$b_1$ = upper entry. $b_2$ = lower entry.


$k_1 = 2.5$ and $k_2 = 3.0$
r=1.0 








CY/Cp 
| 100 10 








—1.785 -319 | 0.071 —0.558 
392 -392 |—3.392 —2.053 


- 785 -319 | 0.071 “ —0.558 
917 |—2.917 |—2.917 : . —1.683 


785 |—1.319 | 0.071 —0.558 
"985 |—2.285 |—2.285 —1.184 


—1.785 |—1.319 | 0.071 - 803 —0.558 
—1.819 -819 |—1.819 ; —0.808 


|—1.7 .319 | 0.071 3 —0.558 
|—0.429 .429 |—0.429 0.332 






























































r=4.0 








C/Cp 
1 2 5 | 10 | 100 2 

-425 |—1.095 |—0.642 |—0.293 | 0.804 —1.085 
-550 |—1.550 |—1.550 |—1.550 |—1.550 —1.455 














-425 |—1.095 |—0.642 |—0.293 | 0.804 —1.085 
-220 |—1.220 |—1.220 |—1.220 | —1.220 —1.147 





—1.425 |—1.095 |—0.642 |—0.293 | 0.804 —1.085 
~0.767 |—0.767 |—0.767 |—0.767 |—0.767 —0.718 


—1.425 |—1.095 |—0.642 |—0.293 | 0.804 —1.085 
—0.418 |—0.418 |—0.418 |-0.418 |—0.418 : —0.380 























—1.425 |—1.095 |—0.642 |--0.293 | 0. —1.085 |—0.655 |—0.318 
0.679 | 0.679 | 0.679 | 0.679} 0.679 0.727 | 0.727 | 0.727 



































b1 = upper entry. b2 = lower entry.
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TABLE 267—Continued 
k1 = 3.0 and k2 = 3.0
r=1.0 








Cc W/Cy 
*5 














—1.683 
—1.683 








—2.285 —1.683 |—1.184 
—2.285 —1.184 |—1.184 











—2.285 |—1.819 —1.683 |—1.184 
—1.819 |—1.819 —0.808 |—0.808 











—2.285 |—1.819 —1.683 |—1.184 
—0.429 |—0.429 0.332 | 0.332 






































r=4.0 r=8.0 








CC. 5 C, M/¢, 
¢ 5 ? c 5 P 

















| 








/Cp & |—1.550 |—1.220 |—0.767 
|—0.767 | 0.767 |—0.767 




















10 |—1.550 |—1.220 |—0.767 
0.418 |—0.418 |—0.418 | 





0.679 

















100 |—1.550 |—1.220 |—0.767 |—0.418 | 0.679 


| 0.679 | 0.679 | 0.679 | 0.679 








b1 = upper entry. b2 = lower entry.





LOWER BOUND FORMULAS FOR THE MEAN 
INTERCORRELATION COEFFICIENT* 


Richard H. Willis
Carnegie Institute of Technology 


The mean intercorrelation coefficient, $\bar{\rho}$, is defined and formulas are
derived for its lower bound, $\bar{\rho}_{LB}$. First, an absolute lower bound on $\bar{\rho}$
is obtained; it is a monotonically increasing function of the number of
variables, $N$, and asymptotically approaches zero as $N \to \infty$. Second,
a corresponding expression is obtained assuming a positive manifold
(no negative intercorrelations), an assumption frequently made when
all variables measure human aptitudes and abilities. Here, $\bar{\rho}_{LB} \geq 0$, and
is a joint function of $N$ and $n$, the dimensionality of the vector space.
Finally, the expression for $\bar{\rho}_{LB}$ is derived for the case that the assumption
of a positive manifold applies to some subset of the variables, but
not to the remaining ones. Depending on $N$, $n$, and the number of
variables contained in the subset, $\bar{\rho}_{LB}$ may be either positive or negative.


It is the purpose of this paper to develop lower bound expressions for the
mean intercorrelation coefficient, $\bar{\rho}$. In general, $\bar{\rho}$ is defined as follows: Given
$N$ distributions of variates, each based on the same population, there will be
$\tfrac{1}{2}N(N-1)$ possible sets of paired variates, and for each such set there will be a
correlation coefficient, $\rho_{ij}$; the mean value of these $\rho_{ij}$'s is, by definition, $\bar{\rho}$.¹
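
As a minimal illustration of this definition, the following sketch averages the off-diagonal entries of a correlation matrix over the N(N-1)/2 unordered pairs. The function name `mean_intercorrelation` and the small example matrix are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def mean_intercorrelation(corr):
    """Average rho_ij over all unordered pairs i < j of a square correlation matrix."""
    n_vars = len(corr)
    if n_vars < 2:
        raise ValueError("need at least two variables")
    pairs = list(combinations(range(n_vars), 2))  # the N(N-1)/2 unordered pairs
    return sum(corr[i][j] for i, j in pairs) / len(pairs)


# Hypothetical 3 x 3 correlation matrix, used only to exercise the function.
example = [
    [1.0, 0.5, -0.2],
    [0.5, 1.0, 0.3],
    [-0.2, 0.3, 1.0],
]
print(mean_intercorrelation(example))
```

Only the upper triangle is read, so the diagonal of ones never enters the average.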

It is well known that, although any number of variables can all intercorrelate
+1.00, it is not possible for a set of three or more variables all to be
intercorrelated −1.00. The same general idea is occasionally expressed
as “everything can’t correlate negatively with everything else.” The formulas
developed in this paper demonstrate that any number of variables may be 
intercorrelated negatively, on the average, albeit not highly so, unless the num- 
ber of variables is quite small. 

If $N$ is the number of variables (i.e., the number of distributions of variates),
and if no restrictions are placed on the direction in which variables are scored
relative to each other, there will be $2^N$ possible scoring conventions for the set
of variables. The mean intercorrelation will depend on the particular scoring
convention chosen, and for this reason it is necessary to provide some rationale
for selecting one of the $2^N$ scoring conventions before computing $\bar{\rho}$. Furthermore,
if restrictions are placed upon the direction of scoring of some variables
relative to some others, the expressions for the lower bound on $\bar{\rho}$ must be
modified. For present purposes it is convenient to distinguish between variables
which are unrestricted in direction of scoring relative to other variables, and
those for which it is required that the direction of scoring, relative to some
other variables, is specified. It is possible, of course, to devise sets of restrictions
on the relative directions of scoring which are inconsistent, but in this paper
attention will be confined to a class of scoring restrictions which generates no





* The author gratefully acknowledges several helpful suggestions made by Robert E. Krug, who read and
commented on an earlier draft of this paper.

1 For computational formulas for mean intercorrelations, see Peters, C. C. and Van Voorhis, W. R., Statistical
Procedures and Their Mathematical Bases, New York: McGraw-Hill, 1940, 196-201.
inconsistencies and which has special relevance for mental aptitude measurement.

Some theories of mental organization make the assumption of a positive
manifold. That is, they postulate non-negative correlations between any two
mental abilities, or postulate that mental abilities may be subdivided into a
few broad categories, within each of which negative intercorrelations will not
be found.² Such a large body of evidence has built up in support of non-negative
intercorrelations among certain groups of human aptitudes that many experi- 
menters would usually be more willing to eliminate as many negative inter- 
correlations as possible by reversing the direction of scoring for some variables, 
or to hold the data suspect, than to accept negative intercorrelations. For most 
other kinds of variables, on the other hand, it would not generally be desired 
to impose such restrictions on the relative direction in which variables are 
scored. 

The above considerations lead us to make the following definitions: 

1. An α-set is a set of variables, each pair of which is postulated to correlate
non-negatively.

2. A variable belonging to an α-set is an α-variable.

3. A variable not belonging to an α-set is a β-variable.

The properties of $\bar{\rho}$, and the value of $\bar{\rho}_{LB}$, its lower bound, will differ as one
deals with α-variables only, β-variables only, or with both.

Let us represent each variable by a vector in $n$-space. Each intercorrelation
can then be represented in the well known³ form

$$r_{ij} = h_i h_j \cos \phi_{ij}, \qquad (h_i \leq 1,\ h_j \leq 1), \tag{1}$$

where $h_i$ and $h_j$, the vector lengths, represent the reliable variances, and the
$\phi_{ij}$ are the angles of separation. Because we are dealing with population
parameters, all $h_i = 1$, so that Eq. 1 reduces to

$$\rho_{ij} = \cos \phi_{ij}. \tag{2}$$

A coordinate system is provided the vector space by $n$ orthogonal reference
axes; one pole of each reference axis is labelled positive, the other negative. A
β-vector may have its projection on any reference axis on either the positive
or the negative pole, whereas α-vectors, unless scored in the wrong direction,
have projections only on positive poles.

In the interests of clarity, we will consider the two homogeneous cases before
developing the case of mixed α- and β-variables. Consider first the case of
β-vectors only, the homogeneous beta case. The value of $\bar{\rho}$ will clearly be a
minimum when the total angular separation between vector pairs, $\sum \phi_{ij}$, is a
maximum, for $\cos \phi_{ij}$ is a monotonically decreasing function of $\phi_{ij}$ on the
interval $0 \leq \phi_{ij} \leq \pi$. The value of $\sum \phi_{ij}$ will be a maximum when vectors are
dispersed as evenly as possible throughout the vector $n$-space, as if they were
electrically charged rods mutually repelling each other. This follows from the
fact that if and only if the vectors are arranged asymmetrically with respect to








2 For a recent discussion of this topic, see Anne Anastasi, Differential Psychology (3rd ed.), New York: Macmillan,
1958.
3 Thurstone, L. L., Multiple Factor Analysis, Chicago: University of Chicago Press, 1947, 89-90.







at least one axis in such a way that the sum of the projections on the positive
pole is not equal to the sum of the projections on the negative pole, $\sum \phi_{ij}$ may
always be increased, without changing the projections on the remaining axes,
by shifting vectors so as to equate these two sums. In the case of uniform
dispersion, the sum of the projections on each positive pole is equal to the sum
of projections on the corresponding negative pole, and consequently no
rearrangement of vectors will produce an increase in $\sum \phi_{ij}$.

Exact uniformity of dispersion, i.e., perfect $n$-dimensional radial symmetry,
will not be possible except in very special cases, such as when the number of
β-vectors equals the number of reference poles. This fact does not concern us,
however, for $n$-dimensional bilateral symmetry is a sufficient condition for
maximizing $\sum \phi_{ij}$, for this insures the equality of the sum of the projections
on each positive pole with the sum on the corresponding negative pole. The
condition of $n$-dimensional bilateral symmetry is always attainable whenever
the number of β-vectors is even. This may be accomplished by clustering half
of the vectors about each pole of any reference axis, for such an arrangement is
bilaterally symmetrical, not only about the one reference axis, but also about
all the remaining ones.⁴

Furthermore, when $N$ is odd, $\sum \phi_{ij}$, and thus $\bar{\rho}$, is not a function of the
position of the odd vector, so long as the other $N-1$ vectors possess $n$-dimensional
bilateral symmetry. Consequently, the position of the last vector does
not affect the value of $\bar{\rho}_{LB}$. This follows from the fact that bilateral symmetry
guarantees that, as the position of the odd vector is changed, it moves closer
to some of the other vectors, but moves farther by an equal amount from the
diametrically opposing vectors, so that $\sum \phi_{ij}$ remains constant.

From the above considerations, it is obvious that, in order to maximize
$\sum \phi_{ij}$, and minimize $\bar{\rho}$, for any number of β-vectors, we need consider only two
cases:

1. When $N$ is even, $\sum \phi_{ij}$ will be maximized by clustering all vectors along
one of the reference axes, with $N/2$ vectors at each pole.

2. When $N$ is odd, the odd vector may be placed at will; $\sum \phi_{ij}$ will be maximized,
for example, if vectors are clustered $(N-1)/2$ at one pole of any
reference axis and $(N+1)/2$ at the opposite pole of the same axis.

This dispersion of vectors into two diametrically opposed clusters of equal or 
equal-but-one size will be referred to as the canonical dispersion of vectors for 
the homogeneous beta case. Canonical dispersion will be defined in general as a 
particularly simple dispersion of vectors throughout the vector space for which 
the condition of maximum total angular separation holds. Consequently, a 
concomitant of canonical dispersion of vectors is that $\bar{\rho}$ assumes its lower bound
value, $\bar{\rho}_{LB}$.

To determine $\bar{\rho}_{LB}$ for the homogeneous beta case, it is only necessary to
count the number of within-cluster intercorrelations (of value +1.00), subtract
the number of between-cluster intercorrelations (of value −1.00), and divide





4 To take a concrete example, consider eight vectors radially symmetrical about the point of intersection of two 
reference axes, I and II. This arrangement may be replaced by one in which four of the vectors are clustered about 
the positive pole of axis I, and the remaining four are clustered about the negative pole of this axis. In both cases, 
the sum of the projections on each positive pole is equal to that on the corresponding negative pole. Likewise, both 
$\sum \phi_{ij}$ and $\sum \cos \phi_{ij}$ remain unchanged by the rearrangement, as may be easily verified by the reader.
this difference by the total number of intercorrelations. For $N$ even, so that
both clusters contain $N/2$ vectors,

$$\bar{\rho}_{LB} = \frac{2 \cdot \tfrac{1}{2}(N/2)(N/2 - 1) - (N/2)^2}{\tfrac{1}{2}N(N-1)} = -\frac{1}{N-1}. \tag{3}$$


When $N$ is odd, so that there are $(N+1)/2$ vectors in one cluster and $(N-1)/2$
in the other,

$$\bar{\rho}_{LB} = \frac{\tfrac{1}{2} \cdot \frac{N+1}{2} \cdot \frac{N-1}{2} + \tfrac{1}{2} \cdot \frac{N-1}{2} \cdot \frac{N-3}{2} - \frac{N+1}{2} \cdot \frac{N-1}{2}}{\tfrac{1}{2}N(N-1)} = -\frac{1}{N}. \tag{4}$$

Thus, for consecutive values of $N$ of the form $2x-1$ and $2x$, $x$ an integer, the
same value holds for $\bar{\rho}_{LB}$. The value of $\bar{\rho}_{LB}$ approaches zero from below as $N$
increases. For $N = 3$ or 4, $\bar{\rho}_{LB} = -.333$, but for $N = 9$ or 10, the lower bound is
rather close to zero, $-.111$. Any number of variables may intercorrelate negatively,
on the average, so long as all or most of the intercorrelated variables are
β-variables, and so long as one is willing to accept intercorrelations which are
arbitrarily close to zero.⁵ Moreover, $\bar{\rho}_{LB}$ is independent of $n$, the dimensionality
of the vector space, an interesting result.
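
The counting step behind Eqs. 3 and 4 can be checked numerically. The sketch below takes the canonical two-cluster dispersion as given, counts the +1.00 and −1.00 pairs, and compares the result with the closed forms; it does not test whether that dispersion is in fact optimal. The function name `beta_lower_bound` is hypothetical.

```python
from fractions import Fraction


def beta_lower_bound(n_vars):
    """Mean intercorrelation of the canonical two-cluster dispersion (beta case)."""
    if n_vars < 2:
        raise ValueError("need at least two variables")
    small = n_vars // 2        # vectors clustered at one pole
    large = n_vars - small     # vectors clustered at the opposite pole
    within = small * (small - 1) // 2 + large * (large - 1) // 2  # pairs at +1.00
    between = small * large                                       # pairs at -1.00
    total = n_vars * (n_vars - 1) // 2
    return Fraction(within - between, total)


# Compare the count with the closed forms -1/(N-1) (N even) and -1/N (N odd).
for n_vars in range(2, 11):
    if n_vars % 2 == 0:
        closed_form = Fraction(-1, n_vars - 1)
    else:
        closed_form = Fraction(-1, n_vars)
    assert beta_lower_bound(n_vars) == closed_form
```

Exact rational arithmetic is used so that the comparison with the closed forms is not affected by rounding.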

Eq. 3 or 4, depending on whether $N$ is even or odd, gives an absolute lower
bound on $\bar{\rho}$. However, if one makes the assumption that certain variables comprise
an α-set, so that within this set no intercorrelations are negative, the expression
for $\bar{\rho}_{LB}$ must be modified. Consider the case in which all the variables
under consideration are assumed to form an α-set, the homogeneous alpha
case. In practice, if one accepts the assumption that one is dealing with
α-variables only, one would also accept any observed negative intercorrelations
remaining after proper reflection of vectors as being due to sampling error.
Such residual negative correlations should not be significantly different from
zero to be consistent with the assumption of the homogeneous alpha case.

Since values of intercorrelations range only from 0 to +1.00, α-vectors can
only be distributed throughout that portion of the $n$-dimensional vector space
associated with all-positive sets of coordinates. That is, coordinates may be
assigned only along the positive pole of each reference axis, thus restricting the
available portion of the complete $n$-space to a fraction of one part in $2^n$. Under
this restriction the canonical dispersion of α-vectors takes the form of $n$ equal
or equal-but-one-sized clusters, one cluster collinear with each of the positive
reference poles. This configuration of α-vectors will obviously maximize $\sum \phi_{ij}$,
for the number of vector pairs at right angles to one another is maximized,
while the number of collinear vector pairs is minimized. Letting $N = np + q$,
where $n$ is the dimensionality of the vector space, and requiring that
$0 \leq q \leq \min(n-1, N)$, it follows that there will be $(n-q)$ vector clusters of size $p$,
and $q$ clusters of size $(p+1)$. The number of within-cluster intercorrelations (of





5 Similar formulas, formally equivalent to Eq. 3, are obtainable through an analysis of variance or intraclass correlation
approach. See, for example, R. A. Fisher, Statistical Methods for Research Workers (12th ed.), Edinburgh:
Oliver and Boyd, 1954, p. 214.
value +1.00) will be equal to $(n-q) \cdot \tfrac{1}{2}p(p-1) + q \cdot \tfrac{1}{2}(p+1)p$, while the number
of between-cluster intercorrelations will be equal to
$\binom{n-q}{2}p^2 + \binom{q}{2}(p+1)^2 + (n-q)q \, p(p+1)$, all of which are of value zero and so do not enter into the
numerator of the expression for $\bar{\rho}_{LB}$:

$$\bar{\rho}_{LB} = \frac{(n-q) \cdot \tfrac{1}{2}p(p-1) + q \cdot \tfrac{1}{2}(p+1)p}{\tfrac{1}{2}(np+q)(np+q-1)}, \tag{5}$$

which reduces to

$$\bar{\rho}_{LB} = \frac{(N-n)p + pq}{N(N-1)}, \tag{6}$$

where $q$ is, at most, either $N$ or $n-1$, whichever is smaller. If the number of
α-vectors is an exact multiple, $p$, of $n$, then $q = 0$, $N = np$, and Eq. 6 further
reduces to

$$\bar{\rho}_{LB} = \frac{p-1}{N-1} = \frac{N-n}{n(N-1)}. \tag{7}$$


Notice that $\bar{\rho}_{LB}$ is not independent of the dimensionality of the vector space,
in contrast to the result obtained in the homogeneous beta case. And whereas
$\bar{\rho}_{LB} < 0$ in the homogeneous beta case, in the homogeneous alpha case, $\bar{\rho}_{LB} \geq 0$,
by Eqs. 6 and 7. For a given number of α-vectors, and so long as $N > n$, the
greater the dimensionality of the vector space, the algebraically smaller the
lower bound on $\bar{\rho}$. For $n \geq N$, $\bar{\rho}_{LB} = 0$. This means that, when dealing with
α-variables, one is faced with the problem of estimating $n$ in some way (e.g.,
by estimating the rank $R$ of the intercorrelation matrix, or by utilizing knowledge
gained from previous experience with the measures in question) before
computing $\bar{\rho}_{LB}$.
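
The homogeneous alpha case can be sketched in the same way. Assuming the canonical dispersion into $n$ orthogonal clusters described above, the function below (hypothetical name `alpha_lower_bound`) counts the within-cluster pairs directly and checks the count against the closed form of Eq. 6.

```python
from fractions import Fraction


def alpha_lower_bound(n_vars, n_dims):
    """Eq. 6 for N alpha-vectors in n dimensions, cross-checked by direct counting."""
    if n_vars < 2 or n_dims < 1:
        raise ValueError("need N >= 2 and n >= 1")
    p, q = divmod(n_vars, n_dims)  # N = n*p + q with 0 <= q <= n - 1
    # (n - q) clusters of size p and q clusters of size p + 1; only within-cluster
    # pairs contribute (+1.00 each), between-cluster pairs are orthogonal (0).
    within = (n_dims - q) * p * (p - 1) // 2 + q * (p + 1) * p // 2
    total = n_vars * (n_vars - 1) // 2
    counted = Fraction(within, total)
    closed_form = Fraction((n_vars - n_dims) * p + p * q, n_vars * (n_vars - 1))
    assert counted == closed_form
    return closed_form


print(alpha_lower_bound(12, 3))  # N a multiple of n, so Eq. 7 gives (p - 1)/(N - 1)
print(alpha_lower_bound(10, 3))  # one cluster carries the extra vector
print(alpha_lower_bound(5, 8))   # n >= N, so every vector gets its own axis
```

When $n \geq N$ the quotient $p$ is zero, every cluster holds at most one vector, and the bound is zero, in agreement with the remark above.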

We now proceed to the heterogeneous case, where both α-variables and
β-variables are included in the complete set. The canonical dispersion of vectors
will differ depending upon which of two conditions holds. In Condition 1, in
which the number of β-vectors is equal to or greater than the number of
α-vectors, the canonical dispersion will be identical to that of the homogeneous
beta case, and consequently the same expressions for $\bar{\rho}_{LB}$, Eqs. 3 and 4, apply.
When Condition 2 holds, viz., when the number of α-vectors exceeds the number
of β-vectors, the canonical dispersion will be of a somewhat more complicated
form. If there are $N_\alpha$ α-vectors and $N_\beta$ β-vectors, and $N_\alpha > N_\beta$, then the
following operations will define the canonical clustering of vectors:

1. Cluster all $N_\beta$ of the β-vectors about one of the negative poles.

2. Cluster $N_\beta$ of the α-vectors on the positive pole of the same axis.

3. Divide all remaining α-vectors into $n$ equal or equal-but-one-sized clusters,
of $p$ or $p+1$ vectors each, and place one cluster along each of the $n$
positive poles, including the positive pole which already carries a cluster
of α-vectors.

For the sake of definiteness, and without loss of generality, assume that the
positive pole which received $N_\beta$ vectors in Step 2 above will always receive $p$
(rather than $p+1$) vectors in Step 3. The number of clusters of size $p+1$ in
Step 3 will be denoted $q$, and $q \leq \min(n-1, N-2N_\beta)$.
The resulting canonical clustering of vectors will consist of one cluster of
$N_\beta$ β-vectors on a negative pole, one cluster of $(N_\beta + p)$ α-vectors on a positive
pole, $q$ clusters of $p+1$ α-vectors on positive poles, and $(n-q-1)$ clusters of $p$
α-vectors on positive poles. It can be seen that this particular arrangement of
vectors will minimize $\bar{\rho}$ from the following considerations. Adding a vector to
a cluster of $N_\alpha$ vectors on a positive pole which is opposed by $N_\beta$ vectors on the
corresponding negative pole will increase $\sum \rho_{ij}$ by $(N_\alpha - N_\beta)$. Adding a vector
to a cluster of, say, $N_\gamma$ vectors which is opposed by no vector will increase
$\sum \rho_{ij}$ by $N_\gamma$. The procedure indicated in the steps outlined above will insure
that, as vectors are added one by one, if $(N_\alpha - N_\beta) \neq N_\gamma$, the smaller increment
to $\sum \rho_{ij}$ is always chosen, thus guaranteeing $\bar{\rho}$ to be as small as possible,
algebraically, after the addition of any number of vectors.

The total number of +1.00 intercorrelations will be equal to
$\tfrac{1}{2}(N_\beta + p)(N_\beta + p - 1) + \tfrac{1}{2}N_\beta(N_\beta - 1)$, from the two diametrically opposed clusters, plus
$(n-q-1) \cdot \tfrac{1}{2}p(p-1) + q \cdot \tfrac{1}{2}(p+1)p$, from the $n-1$ remaining clusters. The total
number of −1.00 intercorrelations will be $N_\beta(N_\beta + p)$, the product of the cluster
sizes of the two diametrically opposed clusters. All other intercorrelations will
be zero. Dividing the algebraic sum of all $\rho_{ij}$'s by the total number of $\rho_{ij}$'s and
simplifying yields

$$\bar{\rho}_{LB} = \frac{np(p-1) + 2(pq - N_\beta)}{N(N-1)}, \tag{8}$$


where $N$, $n$, and $N_\beta$ are given, and the integers $p$ and $q$ must be determined
from the relationship $N = np + q + 2N_\beta$ and the stipulation that $q$ is to be taken
as small as possible. In no case is $q$ larger than $\min(n-1, N-2N_\beta)$. As stated
earlier, Eq. 8 is to be used only in the case that $N_\alpha > N_\beta$.

When $N_\alpha - N_\beta$ equals precisely some multiple $p$ of $n$, then $q = 0$, and the
expression further simplifies to

$$\bar{\rho}_{LB} = \frac{np^2 - N}{N(N-1)}. \tag{9}$$

Within the interval $1 \leq n \leq (N - 2N_\beta)$, an increase in $n$ corresponds to a decrease
in $\bar{\rho}_{LB}$, but the lower bound is independent of further increases in dimensionality.
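
The heterogeneous case can be sketched along the same lines. Under Condition 1 the function below simply returns the values of Eqs. 3 and 4; under Condition 2 it builds the canonical clustering described in Steps 1 to 3, counts the +1.00 and −1.00 pairs, and checks the count against Eq. 8. The name `mixed_lower_bound` and its argument names are hypothetical.

```python
from fractions import Fraction


def mixed_lower_bound(n_alpha, n_beta, n_dims):
    """Lower bound for N_alpha alpha-vectors and N_beta beta-vectors in n dimensions."""
    n_vars = n_alpha + n_beta
    if n_vars < 2 or n_dims < 1:
        raise ValueError("need N >= 2 and n >= 1")
    if n_alpha <= n_beta:
        # Condition 1: identical to the homogeneous beta case (Eqs. 3 and 4).
        return Fraction(-1, n_vars - 1) if n_vars % 2 == 0 else Fraction(-1, n_vars)
    # Condition 2: N = n*p + q + 2*N_beta, with q taken as small as possible.
    p, q = divmod(n_vars - 2 * n_beta, n_dims)
    opposed = n_beta + p  # alpha-cluster facing the cluster of beta-vectors
    plus_one = (
        opposed * (opposed - 1) // 2
        + n_beta * (n_beta - 1) // 2
        + (n_dims - q - 1) * p * (p - 1) // 2
        + q * (p + 1) * p // 2
    )
    minus_one = n_beta * opposed
    total = n_vars * (n_vars - 1) // 2
    counted = Fraction(plus_one - minus_one, total)
    eq8 = Fraction(n_dims * p * (p - 1) + 2 * (p * q - n_beta), n_vars * (n_vars - 1))
    assert counted == eq8
    return eq8


print(mixed_lower_bound(8, 2, 3))  # q = 0, so Eq. 9 applies as well
print(mixed_lower_bound(2, 8, 3))  # Condition 1
```

With `n_beta = 0` the Condition 2 branch reduces algebraically to Eq. 6, which gives a quick consistency check between the alpha and heterogeneous formulas.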

In summary, so long as at least half the variables are β-variables, $\bar{\rho}_{LB}$ is
independent of $n$, the dimensionality of the vector space, whereas if the majority
of variables are α-variables, an increase in $n$ up to a value of $N - 2N_\beta$ extends
the lower bound. In the homogeneous alpha case, $\bar{\rho}_{LB} \geq 0$; in the homogeneous
beta case, $\bar{\rho}_{LB} < 0$; and in the heterogeneous case, $\bar{\rho}_{LB}$ may be either
positive or negative. Zero is the upper bound on $\bar{\rho}_{LB}$ in the homogeneous beta
case, approached as $N \to \infty$. At the same time, zero is also the lower bound on
$\bar{\rho}_{LB}$ in the homogeneous alpha case, the necessary and sufficient condition for
$\bar{\rho}_{LB} = 0$ being that $n \geq N$.
ON CERTAIN TYPES OF RECREATION*
Recreation is an ill-defined human activity; yet, by any reasonable
definition, it is an important and rapidly growing one. Economists,
statisticians, and other social scientists have given it comparatively
little professional attention. Relatively large bodies of statistical material
exist, mostly as a by-product of administrative, informational,
or public relations activities. The form of publication of these data has 
not made their analysis easy. Definitional and conceptual problems 
plague the whole field. Yet it is a potentially rich one for the statistician 
and other researchers; interesting and significant problems abound, and 
the practical utility of research would likely be high. A primary purpose 
of this article has been to call these opportunities to the attention of 
readers of JASA. 


Recreation as a human activity is growing rapidly in importance and in
magnitude, yet recreation as a field of economic and other social science 
research is poorly developed. 

There has been some research, and more writing, in the general field of recreation.
A number of persons with direct experience in some part of the recreation
field have put their ideas and experience into writing.³ Some of these writers
reveal much knowledge and shrewd insight into one or more of the varied
ramifications of the recreation field. Some have conducted semi-experiments or
taken measurements, or drawn from operating experience, so that their work 
has some quantitative as well as operational content. But these and others 
have not been primarily researchers; carefully planned experiments, of sound 
design and subject to rigorous statistical and logical analysis, have been rare 
indeed. As will be shown later, statistical inquiries, even into the limited statis- 
tics now available, have been comparatively few. In total, we know, or think 





* This paper was prepared to supplement statistics on recreation in the forthcoming new edition of Historical 
Statistics of the United States compiled under the joint sponsorship of the Bureau of the Census and the Social Science 
Research Council. 

1 Some of the introductory discussion deals with all types of recreation; the greater part of the statement is 
concerned with outdoor recreation, organized sports, and commercial amusements. The scope of this essay has 
been determined by the arrangement of data in Historical Statistics; see 14. 

2 The views expressed here are personal only. 

3 The general sources I have found most useful include the following:

George D. Butler, Introduction to Community Recreation, McGraw-Hill Book Co., New York, 1949.

George Barton Cutten, The Threat of Leisure, Yale University Press, New Haven, 1926. 

Foster Rhea Dulles, America Learns to Play, D. Appleton-Century Co., 1940. 

Luther Halsey Gulick, A Philosophy of Play, Charles Scribner’s Sons, New York, 1920. 

Johan Huizinga, Homo Ludens, Routledge & Kegan Paul, London, 1949. 

Harold D. Meyer and Charles K. Brightbill, Community Recreation—A Guide to Its Organization and Admin- 
istration, D. C. Heath & Co., Boston, 1948. 

Harold D. Meyer and Charles K. Brightbill, State Recreation: Organization and Administration, A. S. Barnes &
Co., New York, 1950.

Martin H. and Esther S. Neumeyer, A Study of Leisure and Recreation in Their Sociological Aspects, A. S.
Barnes & Co., New York, 1949.

Arthur Newton Pack, The Challenge of Leisure, Macmillan Co., New York, 1934.

G. Ott Romney, Off the Job Living—a Modern Concept of Recreation and Its Place in the Postwar World, A. S. 
Barnes & Co., New York, 1945. 

Jesse Frederick Steiner, Americans at Play—Recent Trends in Recreation and Leisure Time Activities, McGraw- 
Hill Book Co., New York, 1933. 
we know, a good deal about recreation, at the “practical” level, but we have
given it comparatively little formal scientific and statistical inquiry. 

The sociologists are perhaps more of an exception to the foregoing than are 
other professional groups. A good many studies have been made, and even 
more writing presented.⁴ Most of the sociological work seems to have been
aimed primarily at fellow sociologists and perhaps to be as concerned with 
methodology as with application of results. Possibly the same implied criticism 
will in the future be levelled at other professional groups, if or when they get 
more closely interested in recreation; perhaps this is but another expression of 
the communications problem that so often separates different professional 
disciplines. But, however generous an appraisal one might make of sociological 
studies of recreation, it is clear that they have not covered all aspects of the 
activity that are of increasing national interest. 

There has been built up a considerable body of literature on recreation, of 
the foregoing general types. In comparison with other human activities of 
equal importance—however one measures “importance” !—such literature is 
relatively small. As we hope to show later, much other material that is not 
ordinarily dredged up in typical bibliographies is more useful for the economist 
and other serious social science researcher. 

The purpose of this article is to consider briefly the nature of recreation as a 
field of human activity; to suggest some of the lines of economic inquiry that 
might be undertaken on recreation;⁶ and to list some of the sources of data and
their limitations. 


1. RECREATION AS A HUMAN ACTIVITY


One basic difficulty is that there is far from agreement as to what “recrea- 
tion” means. Confusion and variable meanings exist as to kinds of activity, 
kinds of expenditures, and on other matters. 

First of all, perhaps it is useful to distinguish between “recreation,” the
activities of various kinds which people engage in for fun or pleasure, and
“re-creation,” the effect which the activity has upon the minds, spirits, and
emotions of those who engage in it. The regenerative, therapeutic, vitalizing
role of recreation (as an activity) is stressed in many places; this will not be
our primary concern. We are most interested in recreation as an activity or 
group of activities. 

At the one extreme, only those things are considered recreation which are 
clearly fun or sports activities and which are engaged in by groups of persons. 
Thus, sporting events such as baseball or football, movies or theater, and 





4 A recent illustration is the American Journal of Sociology for May 1957 (Vol. LXII, No. 6). Another recent
symposium by and for professional personnel, but not strictly sociological, is The Annals, September 1957 (Vol. 313), 
“Recreation in the Age of Automation.” 

5 Two recent bibliographies are of interest: (1) Reuel Denney and Mary Lea Meyersohn, “A Preliminary
Bibliography on Leisure,” American Journal of Sociology, Vol. LXII, No. 6, May 1957. (2) A Guide to Books on 
Recreation—An Annotated List of over 850 Selected Titles, published by National Recreation Association as the 
September 1957 issue of its magazine Recreation (Vol. L, No. 7, Part II). It is perhaps significant that out of about 
180 and over 850 titles, respectively, only one is common to both lists. While the central focus differs, the greatest 
difference between the two is perhaps in the professional backgrounds and viewpoints of the compilers. 

6 S. T. Dana, Problem Analysis Research in Forest Recreation, Forest Service, U. S. Department of Agriculture,
April 1957.
similar group entertainments are considered recreation. From such a narrow
focus, definitions widen out in many ways, until at the other extreme, all leisure
time or fun activities are included. When such a wide definition is used, all
participant sports, all spectator activities, reading, radio and television,
gardening, hobbies of all kinds, and many other activities are included.
Indeed, some forms of routine living chores, such as some kinds of cooking, may 
then become recreation. Between these extremes lie many other definitions, and 
confusion of terms, some of which we shall try to illustrate. 

While we usually think of recreation in positive terms, as fun or pleasure, it 
may have a negative side also. Some recreation is essentially escape from other 
and less desirable activities or situations, which may themselves be called 
“unfun.” Thus, bowling or poker or fishing might not be a desire to have fun 
so much as it is a desire to escape from familiar surroundings (such as wives) 
or from pressures of work. The familiar Western and murder mystery function 
in the same way, of course. Perhaps a more general definition of recreation is 
that it is activity, which by comparison with what the recreationist would 
otherwise be doing, offers promise of pleasure and fun (relatively).⁷

Concurrently with these varying definitions of activities are equally varying 
definitions of expenditures included as recreation. Under the usual national 
income or national expenditure accounting, many activities that might be in- 
cluded as recreation are not. Travel, food and clothing required for recreation, 
even housing (such as mountain or seashore cabin), and many other items are 
included as transportation, or food, or shelter, rather than as recreation. At 
the one extreme, only those expenditures for admissions to the group types of 
recreation or those expenditures for items such as sporting goods which could 
have no use other than for recreation, are included as recreation. But this so 
obviously leaves out many types of expenditures, the motivation for which is 
fun or use of leisure time, that other definitions of recreation expenditures are 
used. Some of the specific estimates used in recent years will be presented later. 

In this confusion or contradictory use of terms, one issue is whether to in- 
clude all types of fun or leisure activities, or to include only those socially 
approved. For instance, should expenditures for liquor, tobacco, and similar 
matters be included as recreation? One group of students has; others have not. 
(As far as is known, no one has included payments to prostitutes, or for dope.) 
Shall expenditures for books, magazines, radio, television, and similar matters,
and time spent on them be considered recreation or education?⁸ Obviously,
there is great difference between a scientific treatise or a serious novel and 
frothy fiction, between the serious magazines and the comic books. Classifica- 
tions built upon physical objects or physically observable actions have simplic- 
ity and objectivity, but they may be of dubious usefulness in studying human 
behavior. 

Yet another confusion exists, between recreation and travel. To some travel 
agencies and others catering to tourists, recreation and travel seem almost 
synonymous. The American Automobile Association has studied travel almost 





7 I am indebted to my colleague, Francis T. Christy, for the ideas in this paragraph. 
8 In this essay, to conform with the plan for Historical Statistics, we have excluded them from recreation.
in this way, as has the Curtis Publishing Company, for instance.⁹ These groups
or persons are interested in expenditures made by tourists and vacationists
while travelling: expenditures for gas and oil, for other travel items, for food
and lodging, and for the kinds of activities and goods which tourists pay for
while on vacation. This type of vacation expenditure has built literally an
industry, the tourist industry, in many areas.¹⁰ To those areas, to whole states,
and even to some foreign countries, such travel is critical. But it is clearly not 
synonymous with recreation. Some travel, even at the same seasons and to the 
same places, is for business, not for recreation. More importantly, much recrea- 
tion involves little or no travel. Many people spend their vacations, not to say 
their other leisure time, without getting far from their homes. 

The basic difficulty seems to be that recreation, in the broadest sense, is 
concerned with why people do things. If some activity is carried out for fun, 
rather than for gain or comfort or from necessity, it may be considered as 
recreation. If an expenditure is made for fun, rather than for necessity, comfort, 
or further earnings, perhaps it too can be considered recreation. All sorts of 
common experiences and expenditures may thus, under some conditions, be 
recreation—if they are for fun, not for economic gain or from necessity. This 
extremely broad definition may be narrowed down, still retaining the essential 
idea of the spirit in which the activity is carried out or the expenditure made. 
In contrast, most classifications of economic activities and of expenditures are 
concerned with specific activities or kinds of goods, which may or may not have 
the fun motivation that we have considered essential for recreation. 

In the present state of research and discussion about recreation, it seems clear 
that no definition would be acceptable to all groups and for all purposes. Per- 
haps the most practical course is a definition each time the word or idea is 
used, unless the context makes the meaning clear. At the same time, each 
writer or student should recognize that his is not the only, or perhaps the best 
accepted, use of the term. This state of affairs is admittedly unsatisfactory; 
perhaps wider discussion may lead to more specific definitions to cover the vari- 
ous types of situations which exist. But it is surely less confusing to recognize 
this situation and act accordingly, than to pretend it does not exist. 

The foregoing general discussion may be illustrated by three different uses of 
essentially the same basic data, in recent treatises or data releases on recreation 
(Table 285). The Department of Commerce collects data, makes analyses, and 
publishes the results, on gross national product, national income, and con- 
sumer expenditures. One broad grouping under the latter is “recreation.” This 
agency estimates that in 1952 nearly $11.4 billion was spent for recreation; of 
this, about 23% was for radio, television, and musical instruments; 20% for 
sports equipment; nearly 19% for reading; and lesser amounts for other items. 
Using the same basic data, Dewhurst and associates came up with a total 
roughly $1 billion smaller for the same year. A major difference in the two 
estimates was in reading materials; Dewhurst excluded books and magazines 





9 American Automobile Association, Americans on the Highway, Washington, 1953. Research Department, The 
Curtis Publishing Co., The Travel Market among U. S. Families with Annual Incomes of $5,000 or More, Philadelphia, 
1955. 

10 Michigan State University has developed a Resort and Motel Institute to assist business enterprises in this 
general field, for instance. The Federal Reserve Bank of Boston in its periodic reports presents data on the tourist 
“industry”—bookings, activity, etc. 
TABLE 285 

PERSONAL CONSUMPTION EXPENDITURES FOR RECREATION 
ACCORDING TO DIFFERENT DEFINITIONS 
(millions of dollars) 

                                                  Expenditure as estimated by
                                               Dewhurst¹    Department       Fortune
Item of expenditure                                       of Commerce²     magazine³
                                                  (1952)        (1952)        (1953)

Sports equipment:                                  2,153         2,279
  Nondurable toys & sports supplies                              1,164
  Wheel goods, durable toys, etc.                                1,115
  Boats and pleasure craft

Radio, television, & musical instruments
  Purchases
  Repairs

Participant recreation
  Pari-mutuel and coin machines
  Billiards and bowling
  Golf
  Other

Spectator amusements
  Motion picture theaters
  Legitimate theater, miscellaneous
  Spectator sports

Organizations and clubs

Reading:
  Books and maps
  Magazines, newspapers, sheet music

Flowers and plants

Other
Dining out                                                                     1,030
Alcoholic beverages                                                            8,860
Vacations, weekends and foreign travel                                         9,190

Total                                             10,489        11,374        30,600








1 Dewhurst and Associates, America’s Needs and Resources—A New Survey, Twentieth Century Fund, New 
York, 1955. 

2 Survey of Current Business, July 1956. 

3 Editors of Fortune, The Changing American Market, Hanover House, Garden City, N. Y. 1955. 

4 Pari-mutuel only. 


deemed educational or scientific. The remaining differences were due to the 
fact that Dewhurst used data available from the Department of Commerce 
when his book was in preparation; since then, the Commerce series has been 
revised. When revisions of the magnitude shown in the Table are made, on 
statistical bases which are at best somewhat less than satisfactory, one cannot 
but question the accuracy of all the data. The editors of Fortune, in their studies 
of The Changing American Market, go to the Department of Commerce for 
their basic data but employ a wholly different definition, by which a total 
expenditure of over $30 billion on recreation is obtained. Two major additions 
were made by Fortune: alcoholic beverages; and vacations, week ends, and 
foreign travel, each of which accounts for about $9 billion of the difference. 
They also included dining out, which accounts for about $1 billion. Their 
breakdown on other items is slightly different, so that exact comparisons can- 
not be made. 

Each of these, and perhaps other, definitions can be defended—the makers 
of these surely felt that their estimates were the soundest that could be devised. 
While there may be variations in specific estimates, the basic difference is a 
definitional one—what should be included under the concept of recreation. But 
these widely differing estimates do reveal clearly the lack of a well accepted 
definition. 
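
The arithmetic behind this comparison can be checked directly from the totals printed in Table 285. The following is a minimal sketch in Python, not part of the original study; the variable names are illustrative assumptions. Because the Fortune figures refer to 1953 and the other two to 1952, the gaps it computes are only approximate measures of the definitional difference.

```python
# Figures in millions of dollars, copied from Table 285.
# Dictionary keys and variable names are illustrative, not from the source.
TOTALS = {
    "Dewhurst (1952)": 10_489,
    "Commerce (1952)": 11_374,
    "Fortune (1953)": 30_600,
}

# Items that Fortune adds to the recreation concept.
FORTUNE_ADDITIONS = {
    "Dining out": 1_030,
    "Alcoholic beverages": 8_860,
    "Vacations, week ends, and foreign travel": 9_190,
}

# Gap between the Commerce and Dewhurst totals.
commerce_vs_dewhurst = TOTALS["Commerce (1952)"] - TOTALS["Dewhurst (1952)"]

# Gap between the Fortune and Commerce totals, and the part of it
# explained by the three added items.
fortune_vs_commerce = TOTALS["Fortune (1953)"] - TOTALS["Commerce (1952)"]
additions = sum(FORTUNE_ADDITIONS.values())
unexplained = fortune_vs_commerce - additions

# Share of sports equipment in the Commerce total (2,279 in Table 285).
sports_share = 2_279 / TOTALS["Commerce (1952)"]

print(f"Commerce minus Dewhurst: {commerce_vs_dewhurst:,}")
print(f"Fortune minus Commerce:  {fortune_vs_commerce:,}")
print(f"Fortune additions:       {additions:,}")
print(f"Left unexplained:        {unexplained:,}")
print(f"Sports equipment share:  {sports_share:.1%}")
```

On these figures the three Fortune additions account for $19,080 million of the $19,226 million gap, leaving about $146 million to the other differences in breakdown and year.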

The Dewhurst estimates show expenditures in terms of 1950 dollars for 
selected years from 1909 to 1939, since which date all years have been included. 
On this basis, total expenditures for recreation have roughly trebled since 
1909; on a per capita basis, they have slightly more than doubled during these 
years; as a percentage of all consumer expenditures, they have risen from 3% to 
slightly above 5%. These changes have not been regular; expenditures (in 1950 
dollars) on recreation rose rapidly up to 1929, both absolutely and as a per- 
centage of all consumer expenditures; fell sharply in absolute terms and also 
relative to total expenditures during the depression; rose absolutely but not 
much relatively to other expenditures during the war; and rose both absolutely 
and relatively in the postwar period. Substantial further increases absolutely, 
as population increases and per capita income rises, seem certain; and some in- 
crease relative to other expenditures seems probable in the future. The Fortune 
study makes a good deal of the fact that recreation expenditures in any year 
tend to reflect the general income and expenditure relationships of the preceding 
years; this lag operates both when income is rising and when it is falling. 

A total expenditure for recreation, and consequently a recreation “industry” 
gross output, in the $10 to $12 billion range on the basis of the Dewhurst and 
Department of Commerce estimates or of roughly $30 billion according to the 
Fortune estimates, each in the early 1950’s, may be compared with the following 
other economic magnitudes, also for the period of the early 1950’s, and each 
expressed in billions of dollars: 





Current earnings of all member banks of Federal Reserve system 
Value added by blast furnaces, steel works, and rolling mills 
Revenues from ultimate customers, electric light and power industry 
Operating revenue, steam railways 
Value of all mineral production in U. S. 
Retail trade of grocery stores 
New construction in the United States 
Gross income to agriculture 
Value added, all manufacturing 
Thus, even on the narrower definition of recreation, in comparison with 
other well known economic activities in the United States, recreation ranks as 
a major economic and social activity. A careful perusal of economic literature 
and statistical sources will show very much less attention to recreation than 
to the other items listed of roughly equal gross output or expenditure. 

Moreover, expenditures do not properly characterize certain types of recre- 
ation, notably most outdoor recreation. Many such activities are undertaken 
precisely because they yield relatively large satisfactions to the participants at 
modest cash outlays. An outdoor picnic in a nearby municipal park may cost 
very little, for instance—food is brought from home, at little or no cost other 
than would have been incurred at home, and the cost of driving the family car 
is low. The total outlay by individuals, by any accounting, will be much less 
than if the family had gone to the movies. Vacations in national parks and for- 
ests are often very much less expensive than at private resorts; and other illus- 
trations could be given. Expenditure data have value, both because of the inter- 
est in expenditures and because it is only through monetary measures that 
comparisons can be made between different recreation activities; but such data 
are inadequate for some kinds of recreation. 


2. ASPECTS OF RECREATION FOR STUDY 


The need for statistical data on recreation activity is crucial—however one 
defines recreation. However, the needs for data in research, policy formation, 
and administration of recreation dictate the kind and amount of data to be 
collected. The basic reason for research on recreation is to provide a sound basis 
for making policy and management decisions on the development of recrea- 
tional facilities for future needs. The demand for recreation, the capacity of 
areas to supply it, and the interaction between demand and supply, all require 
careful study. The following paragraphs describe these needs more specifically. 
If statistical data could be collected on the aspects of recreation described be- 
low, significant economic and other analyses would be possible. 

First, there is the matter of the resources used for recreation purposes. These 
in turn may be divided into natural resources, and capital or man-made ones. 
Land includes forests and other vegetation, wildlife and other biological phenomena, 
and water, as well as the soil itself. The total area of land may be classified 
into categories on any one or several alternative bases of classification: 
land type, such as bottom lands, hills, and mountains; land areas as contrasted 
to water area; vegetative cover, including tree species, size and age of trees, and 
other items for forested areas; the degree of improvement of the site, with vari- 
ous categories ranging from wild or wilderness to improved campsite; and the 
relative extent of use of the area, ranging from buffer strips to improved and 
heavily used areas. 

For many types of recreation, the land and other natural resource require- 
ments may be small and may seem unimportant. Sports of most kinds require 
comparatively small areas of land, yet the physical qualities of the site may be 
important, and the location more so. Commercial indoor amusements may 
require even smaller areas, but location with respect to other activities of the 
potential users becomes even more important. 
An analysis of the natural resource needs of recreation of different types may 
be limited to the primary or initial needs, or it might be extended backward 
to include indirect and secondary requirements. To use a case that is perhaps 
extreme, a bowling alley requires a certain number of square feet of floor space, 
suitably located with respect to other commercial activities and to the residences, 
working places, and transportation routes of potential bowlers. This is 
the primary need. Bowling alleys must often be heated; this leads to a second- 
ary requirement for fuel resources, and there is also the need for hardwood 
floors and alleys. An attempt to pursue resource requirements back of the 
primary ones would likely soon get lost in a maze of uncertainties, and yet in 
some situations it might be significant. 

As far as is known, no generally accepted and widely used system of classifi- 
cation of recreation areas, or even of outdoor recreation areas, has been 
adopted. A comprehensive classification should include all types of recreation, 
however that may be defined. Even in the comparatively narrow field of recre- 
ation areas provided by states, the National Park Service has recognized 55 
different types, or at least that many different names, of kinds of areas which 
it includes in the generic class of “parks,” as well as other kinds of areas not 
included, such as state forests, state wildlife areas, highway waysides if administered 
by Highway Departments, and others.¹¹ That same agency has suggested 
a more inclusive system of classification of outdoor recreation areas, but this 
has not as yet been published. The National Conference on State Parks has 
adopted a system of classification of outdoor recreation areas.¹² Their study 
used a six-fold classification: State parks, State monuments, State recreation 
areas, State beaches, State parkways, and State waysides. These classification 
systems are deficient, however, not only in lack of inclusiveness but sometimes 
in the lack of a consistent scheme of classification. 

Hardly any type of recreation is possible with only unimproved natural 
resources; some capital investments are almost always needed, and sometimes 
far overshadow the natural resource base. Even for picnicking and camping in 
relatively primitive surroundings, some capital investment in roads, trails, water 
supply, sanitary facilities, and other basic services is necessary. For many out- 
door recreation uses, capital investment in improvements placed upon the land 
far exceeds the original value of the land. This is almost always true for sports 
facilities, such as golf courses, tennis courts, swimming pools, and the like. For 
indoor recreation, capital investments are generally much higher, not only in 
the buildings but in specialized facilities. If radio listening and television watch- 
ing are included as recreation, then these require comparatively large capital 
investments and limited natural resources, at least primarily. 

It would be desirable to have data on total investment in various types of 
recreation facilities, rates of depreciation and annual depreciation charges, and 
amounts of new investment each year—all this by kinds of recreation, and its 
geographic location. An economist might well analyze investment in and use 





11 National Park Service, Department of the Interior, State Parks—Areas, Acreages, and Accommodations, Washington, 
1955 (processed). 

12 Committee on Suggested Criteria, Suggested Criteria for Evaluating Areas Proposed for Inclusion in the State 
Park Systems, National Conference on State Parks. Published in Planning and Civic Comment, December 1954. 
of capital by recreation, just as he would analyze similar factors for the chemical 
industry or the food processing industry in a particular area. 

Second, the annual inputs of labor, various raw materials, and other items 
going into recreation each year might well be the subject of concern or study. 
For outdoor recreation, these would include the men supervising the recreation 
area, repairing and maintaining its roads, trails, and structures, and other 
related expenses. For sports, inputs would include the labor of instructors, 
coaches, and other officials, and materials such as balls of appropriate kind and 
other sports equipment. For radio and television, the inputs of talent for pro- 
grams and other personnel would normally be a considerable item. For books, 
magazines, and similar items, paper and other inputs would be involved. The 
variety of materials inputs into different kinds of recreation would be very 
great, and the kinds of specialized skills among manpower would also be highly 
varied. There would often be general purpose equipment, such as trucks and 
autos, as well as specialized equipment; and in the repair of buildings and 
structures, there would be the use of lumber, cement, and other general pur- 
pose materials. For the economist at least, the material, labor, equipment, and 
other inputs into the recreation “industry” would be a matter of interest. 

Third, a consideration of resources and of annual inputs easily leads to the 
matter of the supply of recreational opportunities, or the capacity of various 
facilities and areas to supply recreation. Various kinds of recreational activity 
have rather definite capacities in relation to the resources used and the expend- 
itures made on them. For instance, a swimming pool, when too many people 
try to use it, no longer provides enjoyable swimming for some would-be swim- 
mers and perhaps for no one. In some cases, capacity is rather obvious and 
sharply defined. For instance, more than one person can hardly read a book at 
once, only two or four persons can play tennis upon a court at a given time, 
and the number of golf foursomes that can get on a course at one time is clearly 
limited, even with the greatest crowding. In a movie or theater, the number of 
seats sets a capacity, except as people will stand, and even this space is limited. 
In other cases, capacity is much more a subjective factor. This is especially the 
case with much outdoor recreation. To take an extreme case, a wilderness area 
loses its charm for those who go there if appreciable numbers of persons are 
encountered in a day’s travel. If a picnic area becomes too crowded, then its 
attractiveness has been dimmed, at least for many people. 

The capacity of a particular type of recreation facility depends not only 
upon its square feet, its facilities, and its maintenance and management, but 
also upon the pattern of use which people try to make of it. This in turn is re- 
lated to their daily and weekly schedules of activity, which grow out of the 
whole culture and economy of which they are a part. If a tennis court is used 
only by adults after work, then during most of the day it is wholly unused; its 
capacity is limited to the number that can use it at the time it is in demand. 
Extreme peaks of demand and little or no use at other times characterize most 
kinds of recreation facilities. For outdoor recreation, the use is greatest after 
work and after school, week ends, holidays, and vacations; for most areas, in 
summer rather than at other seasons, but for nearly all areas, at one season and 
not at others. Most indoor recreation has sharp peaks also, often in the evenings, 
but the time depending in large part upon the particular kind of recreation 
and its clientele. Capacity is often supplied in relation to a physically possible 
maximum use; but this means much idle capacity, on the average, and is 
undoubtedly a factor in the cost of many kinds of recreation. 
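
The idle capacity described above can be expressed as a simple load factor: average use divided by the use that would occur if the facility were full at every hour it is open. The sketch below is a hedged illustration only. The hourly figures for a single tennis court are hypothetical, chosen to mirror the after-work pattern just described, and the function name `load_factor` is an assumption rather than a measure proposed in the text.

```python
def load_factor(hourly_use, capacity):
    """Return average use as a fraction of full capacity over the hours observed."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if not hourly_use:
        raise ValueError("hourly_use must not be empty")
    return sum(hourly_use) / (capacity * len(hourly_use))


# Hypothetical court open 12 hours a day, holding at most 4 players at once,
# and used only by adults during the last 3 hours of the day.
hourly_players = [0] * 9 + [4, 4, 2]

lf = load_factor(hourly_players, capacity=4)
idle_share = 1 - lf

print(f"Load factor: {lf:.1%}")
print(f"Idle share:  {idle_share:.1%}")
```

Under these assumed figures the court is full at the peak yet stands idle for roughly four-fifths of its available player-hours, which is the sense in which capacity sized to peak demand raises the cost of recreation.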

The capacity of outdoor recreation areas to supply recreation is a matter 
which requires further study. That there is some limit in intensity of use be- 
yond which the area ceases to be able to supply what users want—or at least, 
what some users want—is beyond question. The problem is often to recognize 
this point, and to relate it to the characteristics and desires of potential 
groups of users. There may also be a physical limit in some cases. That is, the 
trampling of many feet may destroy the unique vegetation which the visitors 
came to see. There is a need to develop measures of “recreation capacity,” 
analogous to carrying capacity for grazing areas and sustained yield output for 
forests. 

Fourth, the output of a recreation facility is a matter of importance. Recre- 
ation output has at least two dimensions: volume, which is described by such 
measures as numbers of admissions to movies, numbers of visits recorded to a 
national park, and similar numbers for other types of recreation; and content 
or quality. The latter is most difficult to measure; it is largely a sociological or 
psychological phenomenon. Some people will get immense pleasure out of what 
others would find boring or unattractive. Some people, for instance, love camp- 
ing in the outdoors; others regard it as dirty, uncomfortable, disagreeable. 
Some people will not enjoy an area unless it is “popular,” i.e. crowded with 
many others; but other people will find satisfaction only if the use by others is 
below some threshold they regard as tolerable. Thus, any attempt to measure 
the quality of recreation must begin with the consumers or users, and their 
wishes; it may be possible only to say that certain groups, of certain numbers, 
locations, age and sex, income, and other personal characteristics find certain 
types of recreation satisfying. Activities which will appeal to small children will 
not satisfy teen-agers, and their favored activities are not likely to be enjoyed 
by adults, while old people have their special activities, to cite but one obvious 
basis of classification. 

Yet, within the group which enjoys camping, there might well be agreement 
that one spot, or one way of using a kind of resource, was more desirable than 
another. Certainly, some movies are greatly more popular than others, and 
some TV programs have a larger audience than others. There are some quality 
differences that would be widely accepted as better than others. Likewise, one 
method of using a particular type of outdoor resource may be more pleasing to 
a larger number of people than will another. 

As will be shown later, data on output of recreational activities are often 
missing or seriously incomplete. Surprisingly, this is true not only of quality 
measures, but of quantity measures as well. Data on attendance at football 
games have never been systematically collected and analyzed, for instance. 
Data on use of outdoor recreation areas are seriously deficient, perhaps more 
so on the matter of output than on any other factor. Yet, as long as the quality 
dimension of recreation has not been defined or measured in any systematic 
way, data on quantities could at the best be of limited meaning. 







Fifth, closely related to the matter of the output of recreation is the demand 
for it. Total demand for recreation is related to the numbers of people in the 
tributary area, although the age distribution and other characteristics of the 
population may affect the demand for certain kinds of recreation. It seems 
reasonable to assume that the demand for recreation is also affected by the 
average income of the population, since a larger absolute sum and probably a 
larger proportion of income is spent by high than by low income people. Other 
factors, such as transportation facilities, may affect the demand for recreation, 
particularly of certain outdoor kinds. And the amount of leisure of the popu- 
lation served by a recreation facility will also affect the demand for recreation. 
The experience of people with particular kinds of recreation also greatly affects 
their demand for it. For instance, if there are no swimming facilities available, 
there may be little apparent interest in swimming; but if one pool be built, 
demand may shortly rise to the point where three pools are needed. These 
various factors are in addition to the factor more usually related to quantity 
demanded—namely, the price of the good or service sold. Since much outdoor 
recreation is free, it is hard to relate the demand for it to its price. For other 
types of recreation, its suppliers often have fairly definite notions of what their 
customers will pay, as to movies, theater, athletic events, and other recreation 
activities. However, formal demand studies seem mostly to have been lacking. 

The demand for recreation, particularly that for different kinds of recreation, 
is also affected by the amount and quality of the resources used for it, the 
annual inputs or the quality of the service provided to the patrons, the supply 
of competing forms of recreation, and the satisfactions which users or 
patrons get from the particular kind of recreation. If there are ample and varied 
other types of recreation opportunity, then the demand for a particular kind of 
recreation may be rather limited. If there are many and attractive natural lakes 
within a reasonable distance, the demand for swimming, boating, fishing, and 
other activities on a new man-made lake will be materially less than if this is 
the only body of water for many miles. If a swimming pool and bath houses are 
well kept up and properly serviced, the demand for swimming will be much 
greater than if the place is slovenly. Superb waterfalls or other natural scenery 
may have such a stirring effect upon those who behold them as to offset inconveniences 
of travel and lack of facilities, but more visitors will be attracted 
if the servicing facilities are good. To these examples, many others could be 
added, to illustrate the fact that the demand for recreation in general and more 
particularly for specific kinds is affected by many factors other than its direct 
or indirect costs. 

Sixth, and lastly, the impact of a recreational resource upon the local econ- 
omy is a matter which might well be studied. When an unusual natural area, 
such as Zion National Park, is first given national park status, has passable 
roads built to it, and gets a certain kind and amount of advertising, this draws 
to it a comparatively large number of visitors who spend substantial sums en 
route and in the local area, for various kinds of services. The same thing may 
happen when a large reservoir is built in a local area, especially if the lake 
behind the dam has few natural rivals nearby. Even when the natural recreation 
resource does not have this sudden sort of growth, but develops gradually, 
there may be a major impact upon the local economy. The economy of 
Maine coastal towns is closely geared to the summer visitors, for instance. In 
each of these cases, a substantial part of the economic base for the locality is its 
recreation resource which outsiders enjoy. A form of interregional trade devel- 
ops, which enables the vacation area to buy goods and services from “outside.” 
In the case of Switzerland, vacations are a major earner of foreign exchange. 
A particular kind of recreation activity, such as horse racing, which draws 
crowds of people may have the same economic effect as an outstanding natural 
recreation area. Although the benefits of the recreation development are often 
widely diffused through the local economy, and difficult to measure, yet they 
may be large in total and highly important. The impact locally is most direct 
and most easily measured; regionally—that is, over an area of a state or more 
—the impact of a local recreational resource and its use is more diffused; and 
nationally it may be very hard indeed to measure what difference a particular 
local recreational resource makes.¹³ 


3. SOURCES AND EVALUATION OF STATISTICAL DATA ON RECREATION¹⁴ 


Before considering in detail the different sources, it is possible to make three 
generalizations about most data on recreation in the fields we are considering: 

1. The available statistics were not collected for the purpose of social science 
research; they were often obtained as an incident to administrative actions, or 
to guide them, or for their current news value. It should not, therefore, be surprising 
that the available data are often unsatisfactory for social science analysis. 

2. Such statistics as are collected are not fully summarized. In several in- 
stances, as will be noted, estimates of varying accuracy are made currently but 
no one has brought these together into historical series. Work with original 
sources would in some cases lead to series of data that might be useful. 

3. Even such data as are available have been subjected to comparatively 
little social science analysis. 

Although the available data are unsatisfactory to the economist or other 
researcher, it must be pointed out that the various agencies have done the best 
job possible with the limited funds available for this purpose. The need for bet- 
ter data—indeed, the form which better data should take—has not been clear 
in the past; the great growth of the forms of recreation here described may well 
lead to more interest in the matter, with consequent ultimate improvement in 
the basic data. More and better data will almost certainly require the expendi- 
ture of more money for their collection, analysis, and publication; however, 
the sums involved will be small compared to the total cost of the recreation demanded. 
Large improvements in data are possible for the various agencies, but 





13 A provocative but somewhat inconclusive study of this subject, which deserves a larger audience and a wider 
review than it has had, is The Economics of Public Recreation—An Economic Study of the Monetary Evaluation of 
Recreation in the National Parks, National Park Service, Department of the Interior (processed), 1949. Roy A. 
Prewitt, an economist with the National Park Service, formulated several memoranda and circularized a number of 
nationally-known economists whose replies are included. 

14 The scope of this section was determined by the general plan of Historical Statistics. Thus, radio, television, 
books, magazines, and newspapers are considered in the chapter on Communications. 
a comprehensive picture will require data which single agencies acting alone 
cannot obtain. 

For the purposes of this discussion, recreation will be divided into (1) out- 
door recreation, including nonorganized sports, largely on public facilities, 
(2) organized sports, (3) other commercial amusements, and (4) expenditure or 
commodity data. 

Outdoor Recreation: Comparatively little data is available for outdoor recreation 
on privately owned land, and data for publicly owned land is often not 
fully comparable from one type of area to another.¹⁵ The chief federal agencies 
providing outdoor recreation on lands under their jurisdiction are: 

a. National Park Service, which administers the national park system. This 
system includes parks, monuments, battlefields, various other historical areas, 
some parkways and recreation areas, and other miscellaneous units. 

b. Forest Service, which administers the national forests. These lands are 
administered for multiple uses, of which recreation is but one. 

c. Bureau of Sport Fisheries and Wildlife, which administers the wildlife 
refuges, on many of which there are general recreation opportunities as well as 
the possibility of observing the various forms of wildlife. 

d. Bureau of Land Management, which administers grazing districts, the 
Oregon and California Revested Lands, and other miscellaneous areas, on 
which there are some but not many outstanding recreation opportunities. 

e. Corps of Engineers, which builds flood control, navigation improvement, 
and multiple purpose dams; on the reservoirs behind such dams there is often 
excellent recreational opportunity. 

f. Tennessee Valley Authority dams have created lakes with excellent recre- 
ational opportunities. 

g. Bureau of Reclamation dams have also created some outstanding recreational 
areas; the larger of these are administered by the National Park Service 
or the Forest Service. 

Each of these agencies collects certain items of information about recreation 
on the lands it administers; the nature of these data is discussed below. The 
data are published in various mimeographed or other processed releases, or by 
means of news releases, mostly on an annual basis.16 Occasional reports bring 
together data for several years for a particular agency. 





15 Marion Clawson, Statistics on Outdoor Recreation, Resources for the Future, Washington, 1958. 

16 Typical data releases (but not a complete list) of this category are as follows: 

Areas Administered by the National Park Service, a printed publication issued annually. 

Public Use—Tabulations of Visitors to Areas Administered by the National Park Service—published monthly 
during the summer season, quarterly during the rest of the year, each with cumulative totals for the year to date, 
and comparisons for the same period in the previous year; also ten year summaries published at intervals. 

Public Use of National Wildlife Refuges, a mimeographed statistical news release issued annually by the Fish 
and Wildlife Service. 

National Forest Wilderness Areas, a mimeographed release issued annually by the Forest Service. 

Recreation Use on National Forests, an annual mimeographed statistical news release issued by the Forest 
Service. 

Recreation Use of Civil Works Projects—Attendance and General Facilities, an annual mimeographed statistical 
release of the Corps of Engineers, Department of the Army. 

Reclamation’s Recreational Opportunities, a printed folder, and Reclamation Pays an Extra Dividend in Recrea- 
tion and Conservation, a processed report, each issued at irregular intervals by the Bureau of Reclamation, and each 
containing a few limited statistics. 
These reports have certain characteristics which greatly limit their use for 
research on recreation. They are nearly all mimeographed or otherwise proc- 
essed, not printed; they usually consist of a few pages, sometimes of a single 
page, and almost never have hard covers; they do not comprise a numbered 
series, with volume and issue number, although they are mostly issued periodically; 
and the data they contain have not, in general, in the past been published 
in well recognized and widely circulated statistical publications such as 
Current Business, Statistical Abstract, or Agricultural Statistics. As a result, 
these data releases are not preserved by most libraries—indeed, given their 
form, it would be hard to preserve them for ready and long continued use. These 
data are more available, in practice, to recreation specialists than to social sci- 
ence researchers; and the latter may find it difficult to have ready access to such 
reports over a long period of time. A student desirous of undertaking a research 
project in this field almost must go to the agency concerned in order to obtain 
its data. 

Until recently, data for all agencies for a number of years had not been summarized 
in one place.17 Additional detailed data are in the files of the various 
agencies, particularly for individual parks, forests, wildlife refuges, or other 
administrative units. 

Areas usable for recreation, owned by the states, have a wide variety of 
names and are administered by a number of agencies, sometimes by several 
agencies within a single state. Some areas are called parks, or some name con- 
noting parks; but some are called forests, and others have other names. Some- 
times those called forests are as important for recreation as those called parks 
in another state. In some states there are separate departments, called State 
Park Departments or something similar; in other states, the recreation areas 
are administered by a Division of a larger Department, often one that includes 
forests or sometimes all natural resources. In some cases, recreation areas are 
administered by Highway Departments; this is particularly the case for waysides 
and other areas adjacent to highways and provided for the convenience of 
the motorist. In some states, the administrative agency is headed by a single 
officer; in others, there is a park or recreation Board. This diversity of areas 
and methods of administration naturally makes the collection of meaningful 
and comparable data extremely difficult. Fortunately for the social scientist, 
the National Park Service has undertaken to collect annually certain data on 
expenditures, attendance, and other items relating to state parks, and to collect 
still other data to inventory the same areas and their improvements at five 
year intervals.18 These data are published in a form generally similar to those of 
the various federal recreation agencies, previously described. As such, they suf- 
fer from the same general disadvantages. In addition, the significance of the 
data is impaired by reason of the fact that they relate to state parks, as defined 





17 Clawson, op. cit. 

18 These data come in two series: 

State Park Statistics, published annually by the National Park Service, in mimeographed form, and containing 
data on expenditures, sources of funds, attendance, areas and acreage, personnel, and anticipated expenditures for 
the next year. Data are presented in detail for one year, with only the barest summary data comparing previous 
years with the current year. 

State Parks—Areas, Acreages, and Accommodations, published at five year intervals by the National Park 
Service (1950, 1955, etc.), with more detailed information as to facilities at each date. 
in the publications, but do not include state forests, wildlife areas, waysides 
under Highway Department administration, and other publicly owned areas 
on which there is a great deal of recreational use. Moreover, the extent of this 
difference between parks as defined and all state operated lands with recreational 
use is not constant from state to state. The National Park Service is dependent 
upon the voluntary cooperation of the state agencies which it asks to provide 
data; there is some under-reporting, though apparently only to a minor degree, 
since a few agencies with minor responsibilities do not always provide data. 
The accuracy of the data is dependent upon the reporting states. 

Difficult and diverse as are the statistics for recreation in state owned areas, 
the situation is much worse for municipal and other local public recreation 
areas. After all, there are but 48 states, although there are more nearly 100 
separate state park administering agencies; but there are literally thousands 
of cities, as well as some counties, that have parks. Moreover, among these there is 
a diversity of names of areas and of administering agencies. Some data are collected 
by the Bureau of the Census annually, as part of its surveys of local government 
and its cost. The National Recreation Association of New York, a 
private agency, has collected more information from these units of local gov- 
ernment, in some instances in cooperation with a federal agency. Beginning in 
1910 for many years it made annual censuses of municipalities, on such items 
as numbers of specified facilities, numbers of personnel, expenditures, attend- 
ance or use of certain facilities, and the like. As the number of cities with parks 
increased, this task grew more burdensome and beginning with 1942 censuses 
were made in alternate years until 1950; since then they have been at five year 
intervals. An attempt is made to include all cities of 2,500 population or larger, 
but the degree of response has been variable. Most of the larger cities report, 
many of the medium size ones, and a much smaller proportion of the small 
cities. There is reason to believe that a substantial proportion of those not re- 
porting do not have parks, but evidence on this point is lacking. The degree 
of under-reporting has not been constant. The results of these censuses have 
been published.19 

In addition to data on recreation on certain kinds of publicly owned areas, 
there are also data on hunting and fishing on all ownerships of land, and data 
on outboard motors. All states license most hunters and fishermen, in nearly all 
cases differently for residents of the state and for nonresidents. Although infor- 
mation is lacking as to the precise requirements for purchase of licenses in each 
state over a considerable period of time, information is available as to num- 
bers of licenses issued in each state since 1941, when federal legislation provid- 





19 Beginning with the latest publication in this series, these are: 

1956 Recreation and Park Yearbook, National Recreation Association, New York, 1956; 

Recreation and Park Yearbook—Midcentury Edition—A Review of Local and County Recreation and Park De- 
velopments 1900-1950, National Recreation Association, New York, 1951; 

Municipal and County Parks in the United States 1940, National Recreation Association, New York, 1942; and 
other Yearbooks of the Association, published annually in earlier years and at less frequent intervals in recent years. 

Municipal and County Parks in the United States, 1935, George D. Butler, National Park Service and National 
Recreation Association, Washington, 1937. 

Park Recreation Areas in the United States, U. S. Bureau of Labor Statistics, Misc. Series, Bulletin No. 565, 
1932. 

Park Recreation Areas in the United States, U. S. Bureau of Labor Statistics, Misc. Series, Bulletin No. 462, 
1928. 
ing for grants in aid for wildlife purposes required the collection of these data 
from the states. They are compiled and published annually by the Bureau of 
Sport Fisheries and Wildlife (formerly the Fish and Wildlife Service).20 Summary 
data are available for earlier years, and presumably records of some 
states would also be available for earlier years, from the state sources. In 1956, 
for the first time, over-all data on fishing and hunting, based on sample surveys, 
were available.21 

An active trade association collects and publishes data on sales of outboard 
motors, boats, and boat trailers.22 Some data are available for as early as 1919; 
in recent years, data by states are available. 

The data on recreation in specific public areas, on hunting and fishing on all 
land ownerships, and on outboard boats for use on public and private waters 
overlap to an unknown extent. That is, a person might fish on a lake in a state 
park and use an outboard motor, thus being counted three times. Much of the 
hunting and fishing is done on privately owned land, and outboard motors are 
used on private as well as on public bodies of water. These data therefore give 
some indication of recreation on private land. 
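
The effect of this overlap can be shown with a small sketch. The names and records below are hypothetical, invented only to illustrate why adding the three series overstates the number of different people; none of them comes from the agencies' releases.

```python
# Hypothetical records: each series lists the persons it happened to count.
state_park_visitors = {"adams", "baker", "clark"}
licensed_fishermen = {"adams", "clark", "davis"}
outboard_motor_users = {"adams", "evans"}

series = [state_park_visitors, licensed_fishermen, outboard_motor_users]

# Adding the series counts a person once for every series in which he appears.
summed_count = sum(len(s) for s in series)

# The number of different persons is the size of the union of the series.
distinct_persons = len(set().union(*series))

# Persons appearing in all three series are counted three times in the sum.
counted_three_times = set.intersection(*series)

print(summed_count, distinct_persons, sorted(counted_three_times))
```

In practice the overlap cannot be computed in this way, since the published series give totals only and do not identify individuals; that is why the extent of the overlap remains unknown.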

The data on outdoor recreation are highly diverse (Table 297).23 Data on 
acreage used for recreation are available for most types of outdoor recreation 
where this is appropriate, except for privately owned areas. However, there is 
a major problem as to definition of areas included in the statistics—forests as 
well as parks, etc.—to which we referred above. Wayside areas under the 
jurisdiction of state park agencies are included, but wayside areas under state 
highway department management mostly are not. There is also a definitional 
problem for multiple use management areas such as the national forests; the 
area of improved campgrounds and other areas reserved primarily for recre- 
ation understates the case, since areas used primarily for other purposes greatly 
enhance the value of the specialized recreation areas. There is no classification 
of the areas on the basis of the intensity of their use; some are used intensively; 
others have intensively used portions and much larger buffer areas. The time 
period for which data are available varies greatly: for the national parks, from 
1850 to date, for the larger cities by selected years from 1880 to date, but for 
some types of areas primarily only for the present. 

Data on investment of capital in facilities and improvements, on either a 
historical basis, a historical basis depreciated and corrected for changes in price 
level, or a present replacement equivalent of investment, are almost wholly 
lacking. For the state parks and for a few other types of areas, it is possible to 
get estimates of the sums spent in the current year for capital investments; but 
these are not made part of a capital account. This deficiency is particularly 
serious because it is generally believed by recreation specialists that the invest- 





20 These are annual statistical news releases, in mimeographed form, by the Fish and Wildlife Service, Depart- 
ment of the Interior. The tables are headed Hunting License Statistics and Fishing License Statistics; the releases 
have variable headings. The earlier comments as to data on recreation on federal lands apply here also. 

21 25 Million Sportsmen U.S.A.—National Survey of Fishing and Hunting, Circular 44, Fish and Wildlife Service, 
Department of the Interior, Washington 1956. 

22 Outboard Boating Club of America, Chicago, Illinois. Although the association publishes (prints) an attractive 
series of booklets and pamphlets on various aspects of outboard boating, its statistics are issued in mimeographed 
releases annually. 

23 Clawson, op. cit. 
ments in roads, landscaping, facilities of all kinds, and for other purposes vastly 
exceed not only the original cost of the land but its value for alternative pur- 
poses, at least for a large proportion of the parks and other recreation areas. 
The general public often grossly underestimates investments in park improve- 
ments; this is one reason why so many proposals are made to use park areas as 
rights of way for highways. The land looks empty and unimproved. Data are 
often available as to numbers and capacities of campgrounds, and similar data 
on other physical improvements. For some areas, such as the TVA reservoirs, 
such data are in considerable detail. In general, however, such data have 
limited usefulness for comparisons between different types of areas over con- 
siderable periods of time, because definitions are not standardized and no 
measures of the quality of the facilities are included. 

Data on inputs into outdoor recreation are scanty. Information is available 
as to the expenditures of most public agencies administering recreation areas. 
However, when the agency manages the area for other purposes as well, as in 
the case of the Forest Service and the national forests, or when the agency has 
other activities also, it becomes difficult to estimate the true input for recrea- 
tion. Expenditure data do not always distinguish between annual inputs and 
capital outlays. The lack of capital accounts means that depreciation charges 
cannot be added to annual cash outlay. Very little data are available as to ex- 
penditures by private individuals for outdoor recreation; there are data on the 
cost of hunting and fishing licenses and the purchase of new outboard motors, 
but these are very minor in the total picture. There are also the estimates for 
hunting and fishing for one year. Some states have estimated expenditures for 
certain types of outdoor recreation, or by out-of-state visitors travelling within 
the state.24 Aside from serious definitional problems, these data are also deficient 
in their spotty coverage and generally short length of series. 

Inventories have been made of existing outdoor recreation areas. This in 
itself may be no small job; the National Park Service finds it difficult, or im- 
possible, to get a complete inventory of state owned recreation areas, even 
when an attempt is made to have a carefully defined list of areas to include 
and exclude. The National Recreation Association has found it impossible to 
get a complete inventory of municipal recreation areas. One private organiza- 
tion has compiled and published a very useful inventory of public camping areas 





24 See American Automobile Association, op. cit., and Curtis Publishing Co., op. cit.; Kenneth Decker, The 
Tourist Trade in California, Bureau of Public Administration, University of California, Berkeley, 1955. A special 
form of such studies has been the evaluation of wildlife resources and the measurement of expenditures on hunting 
and fishing. Robert F. Wallace, An Evaluation of Wildlife Resources in the State of Washington, Bulletin No. 28, 
Bureau of Economic and Business Research, State College of Washington, Pullman, Washington, 1956; O. N. 
Arrington and P. M. Cosper, The Economic Aspects of Wildlife Resources in Arizona, Arizona Game and Fish De- 
partment, Phoenix, 1953. Willis C. Royall, Jr., Wildlife Values with Special Reference to Idaho Wildlife as a Recrea- 
tional Resource, M.A. thesis, Cornell University, Ithaca, 1954. Nathan W. Fellows, Jr., Economic Importance of 
Fish and Wildlife, Project No. W-37-R-3, Maine Department of Inland Fisheries and Game, Augusta, 1954. Law- 
rence H. Couture, Seventy-four Million Dollars a Year Just for the Fun of It, Massachusetts Division of Fisheries and 
Game, Bulletin 14, 1954. Also with Sargent Russell, How It Was Done, supplement to Bulletin 14, Extension Service, 
University of Massachusetts, Amherst, 1954. David L. White, New Hampshire's 22 Million Dollar Sportsmen, 
Technical Circular No. 11-a, Management and Research Division, New Hampshire Fish and Game Department, 
1955; also Technical Circular 11 (a detailed description of techniques employed in the study). Howard J. Stains 
and Frederick S. Barkalow, Jr., The Value of North Carolina's Game and Fish, Game Division, North Carolina Wild- 
life Resources Commission, Raleigh, 1951. D. L. Leedy and Charles O. Damboch, An Evaluation of Ohio's Wildlife 
Resources; Wildlife Conservation Bulletin No. 5, Ohio Division of Conservation and Natural Resources, 1948. 
only, with difficulty and with less than complete success.25 Others have made 
similar compilations. Some areas have a rated “capacity.” Where the facility is 
overnight lodging, capacity is fairly specific; where it is camping or picnicking, 
capacity usually is related to the number of tables or other key facilities. But 
for some types of outdoor use, it is extremely difficult to state “capacity”; as 
the area grows more crowded, the quality of the recreation goes down. In any 
event, this kind of capacity is usually a peak one, not a season or an annual one. 

A few estimates have been made of potential recreation capacity in different 
areas.26 Such estimates inevitably run into the matter of quality of the recreational 
site; if people are not too fussy, many areas not now used for recreation 
might be so used at some future date. Potential recreation capacity cannot be 
considered intelligently aside from potential future demand for recreation. 

A few formal attempts have been made to establish standards, especially of 
area, for different types of recreation, and as a matter of practice every recreation 
planner must have some such standards in mind.27 Yet there is far from 
complete agreement as to the relationship between area and other attributes of 
natural resources, and the capacity of an area to supply recreation of a par- 
ticular type. The matter of quality enters again. If crowding becomes too 
severe, user satisfactions decline greatly and complaints arise. Moreover, 
crowding may lead to a physical deterioration of the site itself. 

The various kinds of public outdoor recreation usually have some measure 
of the volume of output, but very little or none as to its quality, other than 
that which is inherent in the standards governing the establishment of the dif- 
ferent types of areas. Volume of output is measured in three ways: 1) as num- 
bers of visits, each person who enters each area being counted each time (except 
that campers in an area may not be counted each time they go to the nearby 
but outside grocery store, for instance); 2) visitor days, or numbers of visits 
multiplied by an estimated number of days spent within the area, and 3) actual 
visitors, or different persons, who patronize the area or type of area during a 
season. Each measure has its value; the difficulty arises when they are con- 
fused.28 When an outdoor recreation administering agency maintains an actual 
gate through which visitors must pass, perhaps paying a fee as they do, it is 
easy to count the number of visits. Some aspects of recreation area manage- 
ment are related to the number of admissions to the area. One deficiency of this 
measure is that it does not differentiate between the evening picnic group and 
the all-week camping group. Visitor-days does provide a measure of the latter. 
Where fees are based upon number of days in the area, visitor-days is easily 





25 Campgrounds Unlimited, Campground Guide for Tent and Trailer Tourists, Blue Rapids, Kansas (issued annually). 

26 One interesting and apparently highly competent recent plan is Plan 62—Development Program, by Lane 
County (Oregon) Parks and Recreation Commission (1957?). 

27 George D. Butler, Introduction to Community Recreation, McGraw-Hill Book Co., New York 1949; National 
Recreation Association, Standards for Neighborhood Recreation Areas and Facilities, New York, 1943; National Rec- 
reation Association, Standards for Municipal Recreation Areas, 1948; California Committee on Planning for Recrea- 
tion, Park Areas and Facilities, Guide for Planning Recreation Parks in California—A Basis for Determining Local 
Recreation Space Standards. (Distributed by Documents Section, Printing Division, Sacramento 14, California.) 
In addition, all federal and state agencies engaged in park and outdoor recreation development have plans, manuals, 
or general informal standards to guide their layout of areas. 

28 Marion Clawson and Burnell Held, The Federal Lands, Their Use and Management. Johns Hopkins Press, 
Baltimore, 1957. See particularly p. 69-71. 
determined as part of the administrative process; otherwise, it must be estimated. 
Where data are available for both numbers of visits and number of visitor-days, 
it is possible to calculate the average length of stay. For instance, the 
length of the average stay within national forests has decreased considerably 
since the war; this fact may have both administrative and social significance. 
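
The calculation is a simple ratio of visitor-days to visits. The following is a minimal sketch of it; the function name and the figures are hypothetical, chosen only to show a falling average stay, and are not taken from Forest Service records.

```python
def average_length_of_stay(visitor_days: float, visits: float) -> float:
    """Return average days per visit, i.e. visitor-days divided by visits."""
    if visits <= 0:
        raise ValueError("visits must be positive")
    return visitor_days / visits


# Hypothetical figures, in millions, for an earlier and a later year.
early = average_length_of_stay(visitor_days=30.0, visits=12.0)
late = average_length_of_stay(visitor_days=60.0, visits=40.0)

# Use has risen in both measures, yet the average stay has shortened.
print(early, late)
```

Both visits and visitor-days can grow while the ratio falls, which is why the two series must be kept distinct if a change in the length of stay is to be detected.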

Data on numbers of visits and numbers of visitor-days are often referred to 
loosely as numbers of visitors. If the latter is taken to mean the number of dif- 
ferent persons, this is obviously incorrect. It is incorrect for a single type of 
area, such as national parks or national forests; in recent years, the number of 
visits to each has been nearly one third as large as the total number of people 
in the United States, but the number of individuals who visited each is perhaps 
only one fourth as large as the number of visits. The absurdity of treating data 
on visits as though they were for visitors is apparent when data for all types of 
public recreation areas are combined; including municipal, state, and federal 
areas, the number of visits is at least 10 times the number of people in the 
United States. 
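
The arithmetic behind these statements can be set out briefly. The sketch below uses only the rounded ratios given in the text ("nearly one third," "perhaps only one fourth," "at least 10 times"); they are approximations, and the variable names are illustrative.

```python
from fractions import Fraction

# Rounded ratios taken from the text, for a single type of area.
visits_per_capita = Fraction(1, 3)    # visits / U.S. population
visitors_per_visit = Fraction(1, 4)   # different persons / visits

# Share of the population that actually visited that type of area.
share_of_population_visiting = visits_per_capita * visitors_per_visit

# Average number of visits made by each person who visited at all.
visits_per_visitor = 1 / visitors_per_visit

# For all public areas combined, visits are at least 10 times the population.
# Read as "visitors," that figure would exceed the whole population tenfold.
all_areas_visits_per_capita = 10
reading_is_impossible = all_areas_visits_per_capita > 1

print(share_of_population_visiting, visits_per_visitor, reading_is_impossible)
```

On these rounded figures, only about one person in twelve visited a given type of area, each making about four visits.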


TABLE 300 

AREA, OVERNIGHT CAPACITY, AND ATTENDANCE IN MAJOR 
KINDS OF PUBLIC OUTDOOR RECREATION AREAS1 

Column headings: Kind of area; Present area (million acres), total; 
Present area (million acres), primarily for recreation; Overnight capacity 
(1,000 persons); Attendance in 1955 or 1956 (million)2; Average annual 
percentage increase in postwar years 








Municipal parks ; ; ? 1,000 plus 4 
State parks ; : 195 201 
National park system 4. : 77 55 
National forests ‘ ‘ 400 53 
Federal wildlife refuges . ? 8 





TVA reservoirs , 2 12 40 
Corps of Engineers reservoirs ; 3 36 71 











1 Based on Marion Clawson, Statistics on Outdoor Recreation, Resources for the Future, Washington, 1958. 
2 Most recent year of record; visits except wildlife refuges and TVA reservoirs, which are visitor-days. 


Although the purpose of this essay is to discuss sources of data and their 
potential use, and is not to present findings as such, yet a few data on acreage 
and attendance at major public outdoor recreation areas may help to provide 
perspective (Table 300). Municipal parks are smallest in area, largest in total 
use, much the heaviest in rate of use per unit of area, and apparently are grow- 
ing most slowly in amount of use. (Apparently, because the data are not com- 
plete and the possibility exists that there may be major omissions.) The total of 
federal areas has a somewhat greater use than the state parks, but spread out 
over a very much larger area, and is growing at a faster rate. Particularly note- 
worthy is the rate of growth of recreation on water areas; in part, this is due 
to the large increases in area of water surfaces and the small or no increase in 





DATA FOR RESEARCH ON RECREATION 301 


area of land available for recreation. The rates of growth in use are all compara- 
tively high; even the municipal parks are increasing in use as fast as gross na- 
tional product. 

Data on numbers of different persons who use different types of public and 
private outdoor recreation areas would be extremely useful in various types of 
social science research on recreation. In addition to data on mere numbers of 
people, information on age, sex, family composition, profession or occupation, 
income, and residence of different recreationists would be valuable. Such infor- 
mation would give a measure of how widely each type of recreation was shared, 
and thus be helpful in appraising its effect on the population. It is also critical 
to the matter of how the costs of recreation should be borne—taxes versus ad- 
mission fees, primarily. Very few data exist on number of individuals visiting 
different types of outdoor recreation areas. It would be very difficult for a single 
agency to collect such data, even for the areas under its jurisdiction. For one 
thing, until the year was over, users of such areas could not tell what other 
areas they would use, and then a memory bias would enter. Data of this kind 
could best be collected by research agencies on a sample basis, but covering all 
types of outdoor recreation areas. 

Although there are almost no data which directly measure the satisfactions 
which users obtain from different types of outdoor recreation areas, there are 
some data as to the kinds of activities which people engage in on each type of 
area. For instance, the Forest Service estimates how many of its recreation vis- 
its were for general enjoyment of forest environment, picnicking, fishing, hunt- 
ing, camping, winter sports, swimming, hiking and riding, and other purposes. 
Where the administrative agency has different types of units under its manage- 
ment, the number of visits to each type gives some measure of the popularity 
of each kind of activity. Thus, the National Park Service has data on numbers 
of visits to national parks, to historic sites, and to other units of the national 
park system. Attendance data for state parks are divided between day and 
overnight use, with some further details about the latter. When these data are 
supplemented by other information as to average length of stay for each type of 
activity, as is the case for the national forests, this permits some inferences as 
to the character of the use. But, in general, the amount and kind of the satisfactions 
which visitors obtain from different types of outdoor recreation have 
not been subjected to specific measurement. 

The demand for outdoor recreation, in the economist’s sense of the term 
demand, has had very little study. Recreation specialists and park administra- 
tors have often made advance estimates of the probable use of particular out- 
door recreation areas, sometimes with high accuracy, but often grossly under- 
estimating the potential demand. In so doing, they have taken into account the 
total population in the service area of the facility, transportation routes and 
travel times, the amount and character of possible alternative recreation 
sources, and the inherent quality of the recreation area itself. Some of these 
factors, as the last one, are subjective rather than objective, and have generally 
not been quantified. Nevertheless, they are very real. In thus estimating probable 
demand for a recreation area, price or admission fee has generally not been a 
major factor. In part, this is because so much recreation is public, either free or 
at nearly nominal fees, and thus there is not a clear relation between the price 
and the volume of use. 

If meaningful studies are to be made on demand for recreation, it is essential 
first to classify outdoor recreation opportunities according to user groups, or 
at least to a combination of user and resource factors, rather than according to 
administrative factors as they are so often at present. One possible system of 
classification is given in Table 302. This classification scheme employs two 
major factors: location of the recreation users, and natural features of the areas. 
It would be possible to develop other schemes of classification that used only 
one of these two factors, or that used still other factors. Subdivision of classes 
would in any event be necessary. It seems probable that competition would be 
greatest between those types of recreation which could be used within a par- 
ticular time period, such as after work, and that would appeal to a particular 
age class, such as adult but not aged males. With the role of cost or price re- 
duced to such a subordinate position, the use of outdoor recreation areas is 
almost surely dependent on other factors, of which existence and location of 
alternative recreation opportunities may be the most important. 


TABLE 302 

GENERAL CLASSIFICATION OF OUTDOOR RECREATIONAL 
USES AND RESOURCES 

Type of recreation area: Consumer oriented; Resource based; Intermediate 

1. General location 
   Consumer oriented: Close to users; on whatever resources are available 
   Resource based: Where outstanding resources can be found; may be distant from most users 
   Intermediate: Must not be too remote from users; on best resources available within distance limitation 

2. Major types of activity 
   Consumer oriented: Games, such as golf and tennis; swimming; picnicking; walks and horse riding, zoos, etc.; playing by children 
   Resource based: Major sightseeing; scientific and historical interest; hiking and mountain climbing; camping; fishing and hunting 
   Intermediate: Camping, picnicking, hiking, swimming, hunting, fishing 

3. When major use occurs 
   Consumer oriented: After hours (school or work) 
   Resource based: Vacations 
   Intermediate: Weekends 

4. Typical sizes 
   Consumer oriented: One to a hundred, or at most to a few hundred, acres 
   Resource based: Usually some thousands of acres, perhaps many thousands 
   Intermediate: A few hundred to several thousand acres 

5. Common types of agency responsibility 
   Consumer oriented: City, county, or other local government; private 
   Resource based: National parks and national forests primarily; state parks in some cases; private, especially for seashore and major lakes 
   Intermediate: State parks; private 
Only a comparatively few studies have been made of the economic impact of 
a recreational resource on a locality, a state, or a region. The National Park 
Service, in cooperation with the Bureau of Public Roads and State Highway 
Departments, has made sample studies for some national parks.29 Data are 
collected from a sample of visitors on such matters as their state of origin, the 
length of time on the trip, the days spent in the park, number of people in the 
party, expenditures per day or per trip, and the like. Such data are highly use- 
ful. Their value is limited by the small number of such studies and their lack 
of continuity, as well as by doubts as to the randomness of the samples studied. 
A more difficult matter is to appraise the effect of the park as such. That is, 
people taking a long trip may well visit a national park en route; but was the 
visit to the park strictly incidental, or one of several reasons for the trip, or 
the basic reason for it? If the latter, how far away from the park does the 
economic impact reach—back to the point of first expenditure after leaving 
home? The problem is difficult enough for a major recreational development
relatively isolated from other activities, such as are most national parks. But 
it becomes much more complicated when it is a state park in the midst of much 
other economic activity, or only the improvement of such a park, which is 
under study. Clearly, a greatly improved local park will draw some additional 
visitors and they will spend some additional money locally; but how much? 
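
A minimal sketch of why the attribution question matters, in Python. It assumes a hypothetical survey in which each sampled party reports its trip spending and whether the park was incidental, one of several reasons, or the basic reason for the trip. The attribution shares (0, 0.5, 1), the `Party` record, and every figure are illustrative assumptions, not values from the surveys cited.

```python
from dataclasses import dataclass

# Hypothetical attribution shares: the fraction of a party's trip spending
# credited to the park, by stated reason for the visit (assumed values).
ATTRIBUTION = {"incidental": 0.0, "one_of_several": 0.5, "basic": 1.0}


@dataclass
class Party:
    expenditure: float  # dollars spent on the trip (made-up figure)
    reason: str         # one of the keys of ATTRIBUTION


def attributable_spending(sample, total_parties, shares=ATTRIBUTION):
    """Expand the sample mean of park-attributed spending to all parties."""
    if not sample:
        raise ValueError("sample must not be empty")
    mean_attributed = sum(
        p.expenditure * shares[p.reason] for p in sample
    ) / len(sample)
    return mean_attributed * total_parties


# Three sampled parties, expanded to a hypothetical 1,000 visiting parties.
sample = [
    Party(200.0, "basic"),
    Party(100.0, "one_of_several"),
    Party(300.0, "incidental"),
]
estimate = attributable_spending(sample, total_parties=1000)
print(round(estimate, 2))
```

With these made-up figures the estimate is roughly $83,000. Crediting all spending to the park (every share set to 1) would give $200,000, which shows how strongly the answer depends on the attribution rule and on the randomness of the sample.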

Organized Sports. As one examines statistics on sports, whether participant 
or spectator, one concludes that sportsmen are not economic statisticians, or 
economic statisticians are not sportsmen. There is a great deal of popular 
interest in sports, as a look at any large city newspaper will testify; in the 
largest newspapers, several pages are devoted daily to the current sports scene. 
There are numerous specialized magazines given over to one or a few closely 
related sports. There is no lack of statistics on such matters as batting averages, 
yards gained from scrimmage, and the like; but an almost complete lack of 
dependable statistics on attendance, expenditures, personnel employed, and 
the like. The news stories that appear in these publications often contain figures 
on the attendance at particular events or on the salaries of certain outstanding
performers, and the like. The sceptical statistician cannot but doubt the
accuracy of many such figures; they are so often rounded off, probably uni- 
formly upwards, and there may well be many temptations for exaggeration and 
few restraining influences. However, with few exceptions, even these data are 
not summarized, grouped into statistical series, and published. It might well 
be possible to do so, were one willing to spend the time necessary for perusal 
of daily, weekly, or monthly issues of various periodicals; but it might also 
prove impossible, for attention is focused on the unusual and outstanding, with 
possible neglect of the more ordinary events. There are some exceptions to 
these generalizations, as will be brought out later; in general, reasonably adequate
statistics for social science research exist where an association has taken
the pains to collect the data. Statistics of another kind have received major
attention from sports enthusiasts—records of unusual performances, or of unusual
participants. The Encyclopedia of Sports, for instance, gives major records
for virtually every sport; while some attendance data are given, their coverage
is limited and they are suspect as to accuracy. The Official Encyclopedia of
Baseball boasts that it includes a record of every player ever to appear in a
major league game, yet its data on attendance, total salaries, and other items
that would be of interest to the social scientist are distinctly limited.

29 Arizona Highway Department, U. S. Bureau of Public Roads, and National Park Service, Grand Canyon
Travel Survey, (1955?); Montana State Highway Commission, U. S. Bureau of Public Roads, and National Park
Service, Glacier National Park Tourist Survey, 1951; California Division of Highways, U. S. Bureau of Public Roads,
and National Park Service, Yosemite National Park Travel Survey, 1953; National Park Service, Vacation Survey
Rogue River Basin, 1950; and Virginia Department of Highways, U. S. Bureau of Public Roads, and National Park
Service, Shenandoah National Park Tourist Survey, 1952.

Although all types of sports require some area of land for their enjoyment, 
very little information has been collected about the areas involved. Frequently, 
sports fields are included in parks or are part of educational layouts, and thus 
the data on their area are included there. The total area of land required for 
all sports activities is small compared to that used for other outdoor recreation; 
but its location often causes it to have high values. The investment in grand- 
stands, playing field improvements, and other aspects of sports areas must 
be large, yet we have been unable to find any systematic data on it. Again, the 
location of sports areas in parks or on school grounds would make it difficult to 
compile data for sports areas alone. 

There would be difficult problems of defining “inputs” into sports. Expendi- 
tures for sports equipment would surely be included; these we consider later. 
Salaries and other payments to professional athletes, coaches, and others di- 
rectly involved would surely be included also. We have found no series of data 
containing this information. (Presumably payments to “amateur” players, 
such as college football players, are top secret information.) The time of amateur
players and unpaid officials presumably would not be counted, because unpaid.
Yet in a larger account of national life, the efforts of volunteer workers are highly
important. Because data do not exist, and presumably have never been sought, 
the definitional problems have not arisen. 

Particularly difficult and interesting problems would arise, were one to try 
applying the concepts of a supply curve to sports. There clearly is a supply of 
professional sports, for spectators. The volume supplied is a function primarily 
of the demand, it would seem; a far larger supply is potential at all times 
though perhaps “quality” would suffer if supply were increased. It is doubtful 
how many athletes choose to become athletes because of any careful calculation 
of probable earnings; in some lines, as in prize fighting, they may be attracted 
by the grand prize more than by any realistic estimate of their probable life- 
time earnings. For the participant sports, supply equals demand in a peculiarly 
direct way. The man who demands a round of golf supplies it! Each might be 
higher, were the course more attractive or less crowded or the amount of leisure 
greater. At any rate, the techniques of supply analysis have not, as far as we 
can learn, been applied to sports, whether spectator or participant. 

It is data on output of sports, or attendance, that are most numerous, and 
yet highly unsatisfactory for social science research. For current news, in daily 
newspaper or in weekly news magazine, data on attendance at particular foot- 
ball games, racing meets, etc. are published. One suspects considerable error 
in such data. Experience over a season is sometimes summarized at the end of 
the season. The New York Times often carries such summaries, at the end of
the year. But the emphasis is upon the recent, the latest, experience, and only 
rarely are data for more than one year published in such vehicles of public 
information. 

However, for some types of sports, data on admissions or attendance are 
available, and are published, either as statistical series or in reasonably com- 
parable form from year to year. Some of the best data are available for base- 
ball. The American League and National League Service Bureaus compile at- 
tendance data, which are published in various places—the World Almanac pub- 
lished annually by the New York World Telegram and The Sun is one of the 
most convenient sources, but the New Encyclopedia of Sports, annual issues of 
the Information Please Almanac, and other publications carry these data. Pub- 
lished data cover the period from roughly the first World War to date, unless 
one goes to early issues in these publications. The data cover the major leagues, 
including the world series, reasonably well, but there are very few data on 
minor league and semi-pro league attendance. The American Bowling Congress 
collects and publishes data on number of teams, numbers of leagues, and other 
relevant participant information; these are not strictly attendance data, be- 
cause they do not include spectators, but they do provide an index as to total 
participation. These data begin in 1896 and extend to the present. The Thor- 
oughbred Racing Association collects data on attendance, numbers of racing 
days, number of races, and numbers of horses raced; these data are published 
in various places, a convenient source being the Encyclopedia of Sports, 1953. 
This same publication, and the one for 1947, also give data on pari-mutuel 
betting, and data for horse racing in some states, particularly in New York. 
These sources also contain some data on trotting horse races, which were compiled
by the U. S. Trotting Association.

At the other extreme, there are no complete data on attendance for a number 
of major sports. The Ring, “World’s Foremost Boxing and Wrestling Magazine,” 
advises us that some states estimate attendance at boxing and wrestling 
matches, as part of the collection of admission taxes, but that not all states do 
so; and that such data as are available are published in their magazine—which 
is very little. The National Collegiate Athletic Bureau advises us that com- 
prehensive data on attendance at college sports events have never been com- 
piled; the National Collegiate Athletic Association had a television committee 
which printed reports in 1955 and 1956, with limited football attendance data 
for recent years. The National Federation of State High School Athletic Associ- 
ations has provided us with some statistics on admissions to certain types of 
high school sports; these data are clearly rough approximations, with little or 
no historical length. The American Kennel Club publishes the American Kennel 
Gazette, in which appear some data as to numbers of different breeds of dogs 
entered in shows in recent years, but not attendance data. In the New Encyclo- 
pedia of Sports, 1947 there are given attendance figures for basketball, bicycle 
racing, corn husking, dog racing, golf, motor cycling, polo, rodeo, roller skating, 
skiing, softball, swimming, track and field meets, and automobile racing; these 
data mostly relate to 1946, with little or no historical series. Since the data are 
all rounded heavily, since comparable data for other years are not given, and
since the sources are not always clear, it seems obvious that these data are the 
roughest possible approximations. 

The quality dimension of the output of sports has had little study. Certainly 
nearly everyone enjoys participating in some sports more than in others, and 
in watching some more than others; to what factors these differences in in- 
terest may be related, is not clear. Interest of the family and of others in school 
or in the neighborhood may well be a major factor. The degree of participation, 
as a spectator or as a participant, probably depends to a major degree upon 
convenience of the place of participation, quality of the sports area (especially 
for participants), skill and rivalry of the competing parties (for spectators), 
and other factors. There is evidence that many participants experience strong 
emotional reactions from sports; and some spectators do also, vicariously. It 
has been argued that boys and girls learn important lessons of democracy and 
of social organization and activity from sports and other recreation.³⁰ But, as
far as we know, no attempt has been made at a comprehensive evaluation or 
scheme of analysis of the satisfactions arising out of sports. 

Likewise, almost no formal demand studies have been made for sports ac- 
tivities. One would expect factors other than price or cost to be dominant. In 
the case of spectator sports, the popular interest aroused over contests would 
seem to far outweigh the admission price, as a determinant of demand. Yet, 
attendance is not independent of price, and it seems probable that sports
promoters have fairly definite ideas of a demand curve—of a price at which
available facilities may be filled. There is a considerable degree of inflexibility
in prices; baseball games, for instance, may have the same prices fixed for the
entire season, and differences in demand arising out of popular interest in contests
take the form of greater or smaller attendance. As to participant sports,
the cost of engaging in the sport may be a major factor in its popularity; but
the actual fee or other direct cost for the privilege of play may be but a small
part of total cost. Equipment, travel cost, club dues, and other factors may be 
far larger; and availability of free time for play may be most important of all. 

It would be most difficult to measure the impact of sports activities on the 
total economy of an area. A major football game may draw thousands of visi- 
tors to the college town for the week end, or the World Series may draw many 
out-of-town visitors. It would be possible to make some measurements of the 
impact of such sports activities. While some very rough estimates have been 
made as to total spending of visitors under such conditions, no careful, formal, 
and long-continued measurements have been made, nor has the dispersion of 
this spending within the local economy been measured. In general, sports are 
enjoyed locally—that is, by people who do not travel far from their homes for 
this purpose. This is especially true for participant sports, but is also true to 
a major degree for spectator sports. Under this circumstance, the impact of 
sports activities upon the local economy may be comparatively small—not 
comparable to the effect of a major tourist attraction which draws visitors pre- 
dominantly from outside areas. Of course, in a few instances, sports are the 
attraction which draws people to an area. 





30 Luther Halsey Gulick, A Philosophy of Play, Charles Scribner’s Sons, New York, 1920.







Commercial Amusements. The line between spectator sports and commercial 
amusements is not always clear. In the discussion above, we have put horse 
racing, dog racing, bowling, and some other items into spectator sports, when 
with perhaps equal logic they could have been considered commercial amuse- 
ments. As the term is used here, commercial amusements include chiefly the 
movies, theater, concerts, and similar activities. Several characteristics define 
this group: the recreation is usually provided primarily for profit, with cultural 
values, if any, secondary; it is provided by a promoter or business interest 
group, usually; it must have popular patronage to survive; the patrons take a 
passive, or being-entertained, role; and the activities are paid for directly and 
in cash by those enjoying them, with little if any indirect cost to the patrons in
addition to the direct cash cost. These activities are predominantly indoor, 
not outdoor; and by definition they are private, not public, in their origin. 

Commercial amusements, as thus defined, take little land and other natural 
resources, directly and at the point of use. Indirectly, through income demands 
and in the production of films and other materials, they may use more land. 
Although the area of land required is very small, yet the land suitable for this 
purpose must be strategically placed, and its values are high. As far as we 
have been able to ascertain, there are no data on the area of land used for this 
purpose. Large amounts of capital are required for this type of recreation, both 
directly at the point of use, and indirectly in the production of materials. 
Specialized buildings, often costly, are required, as well as other specialized 
equipment. It seems probable that data exist on investment in commercial 
amusement facilities of different kinds, but no such data in readily summarized 
form, covering a period of years, were discovered. 

The annual inputs into commercial recreation, as here defined, must be 
rather large. There would be an annual capital charge, based on interest and 
depreciation of the capital assets; labor charges, both directly in making the 
recreation available to users and indirectly in production and distribution of 
films and other materials; and other current inputs. Census data show numbers 
of people employed in some of these activities, for instance in motion pictures. 

The supply schedule for commercial amusements is presumably rather elas- 
tic, although, as far as is known, no formal studies of this type have been made. 
If the demand existed for a far larger supply of movies, for instance, presuma- 
bly the volume of output of the motion picture industry could be greatly ex- 
panded, and perhaps without any or major increase in cost per unit of output. 
The matter of the “quality” of the output might be different; difficult as it is to 
measure, presumably it would decline if volume rose rapidly. 

The volume of output of commercial amusements has been measured. Data 
on admissions have been calculated by the Motion Picture Producers Associa- 
tion, and are published in the 1956 Yearbook of Motion Pictures; other data, not 
exactly the same, are collected and published by a private organization, Sind- 
linger and Co., Inc., Ridley Park, Pennsylvania. Since excise taxes are paid on 
admissions, data on excise collections, as published by the Bureau of Internal 
Revenue in its annual reports, are valuable as indexes to total admissions. 
However, the rate of tax has varied from time to time, which necessitates major 
adjustments, and the rate varies somewhat according to prices paid, thus requiring
a minor adjustment. In recent years, average weekly attendance at
movies has been about 45 to 50 million, and annual gross receipts from film 
theaters somewhat in excess of $1 billion. Although attendance figures at 
theaters and other types of recreation included here are sometimes quoted in 
newspapers, no statistical series of them has been assembled, as far as could 
be determined. 
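
The adjustment of excise collections for changes in the tax rate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tax is taken to be a flat fraction of the admission price net of tax, the function name `implied_admissions` is hypothetical, and the collections, rates, and prices below are made-up figures, not data from the Bureau of Internal Revenue.

```python
def implied_admissions(tax_collected, tax_rate, avg_price):
    """Admissions implied by excise collections (illustrative assumption).

    tax_collected: excise dollars collected during the year
    tax_rate:      tax as a fraction of the admission price net of tax
    avg_price:     average admission price net of tax, in dollars
    """
    if tax_rate <= 0 or avg_price <= 0:
        raise ValueError("tax_rate and avg_price must be positive")
    receipts = tax_collected / tax_rate  # gross receipts net of tax
    return receipts / avg_price          # number of admissions


# Two hypothetical years with different tax rates and average prices.
years = {
    "year A": {"tax_collected": 100e6, "tax_rate": 0.20, "avg_price": 0.50},
    "year B": {"tax_collected": 60e6, "tax_rate": 0.10, "avg_price": 0.60},
}
for label, figures in years.items():
    print(label, round(implied_admissions(**figures)))
```

In this made-up case collections fall by 40 percent between the two years while implied admissions are unchanged, which is why raw collections cannot serve as an index of admissions without adjusting for the rate and for prices paid.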

The movie industry and others have made studies of attendance.³¹ While such
studies do not measure the “quality” dimension of output directly, they do 
throw some light on which age groups go to movies, and what they seem to 
get out of them. While many people are only mildly entertained by movies and 
the theater, others find them deeply emotional in effect; how these emotions 
compare with those aroused by sports, both participant and spectator, and by 
outdoor recreation of various kinds, has not been studied, as far as we are able 
to ascertain. : 

Formal demand studies for commercial amusements have not been made, as 
far as could be determined. As in the case of other types of recreation, it seems 
doubtful if price is the major factor affecting demand. The availability of alternative
recreational or other activity outlets, at the same time and at something
like the same cost, may be far more important. The impact of television
upon the movie industry is an example; when people could have similar enter- 
tainment in their own homes, at a marginal cost of nearly zero (once the in- 
vestment cost of the television set had been met) and without the necessity of 
getting a baby-sitter, they did not go out to the movies in anything like the 
same numbers. This is not to say that price is negligible in importance; probably 
at some level, customers would rebel. Moreover, there is a high degree of 
rigidity in prices; when an extra good movie comes to town, most show houses 
do not raise their prices, but differences in demand are reflected in numbers 
attending. As in the case of some other types of recreation, it seems probable 
that operators in this field have fairly well defined notions of demand schedules 
—of how high prices may be and yet draw desired volumes of patrons. But 
such notions are not formalized into demand studies. 

The economic impact of commercial amusements upon the economy of a 
town or community is especially hard to measure. In general, this type of recrea- 
tion is one which follows and adapts itself to the numbers of people and their 
wishes rather than one which has independent economic strength to affect in- 
come and employment. The absence of adequate commercial amusements, as of 
other recreation, would be considered a serious shortcoming of any area, by 
those considering moving there; but as long as private business is free to estab- 
lish movie houses and similar entertainments, this type of commercial amuse- 
ment will likely develop as fast as a market for it is evident. 

Commodities Used. The previously presented data on expenditures on recreation,
however it may be defined, also include data on some commodities used
for this purpose, as well as some services. In the field of sports, commodities
are especially important; various kinds of sports equipment involve substantial
outlays of money. The basic data for these expenditures come from the Census
of Manufactures. Total output data, in monetary terms, and some data on
physical output of specific items, are available. By working with trade associations,
and possibly with the larger individual manufacturers, it might be possible
to get more detailed data, by specific items of equipment. This has not been
attempted for the present purpose.

31 Benjamin B. Hampton, A History of the Movies, Covici, Friede Publishing Co., New York, 1931; Martha
Wolfenstein and Nathan Leites, Movies: A Psychological Study, Free Press, Glencoe, Illinois, 1950; “The Motion
Picture Industry,” Annals of the American Academy of Political and Social Science, Vol. CCLIV, 1947; Leo C. Rosten,
Hollywood—the Movie Colony—The Movie Makers, Harcourt, Brace and Company, New York, 1941.

Some, but not all, of the types of analysis suggested, could be applied to ex- 
penditures for these purposes. The manufacture of such equipment takes re- 
sources, particularly of capital, and involves annual inputs of labor and ma- 
terials. There might be some form of a supply curve also, although presumably 
supply could be highly elastic if demand warranted. The output of sports 
equipment and other commodities used for recreation is the aspect most likely 
to warrant study, and on which data presumably could best be assembled. 
Demand studies, in the formal sense, have not been made, as far as could be 
ascertained. Presumably, demand is critical to production and distribution of 
these commodities.


4. RESEARCH POTENTIALS IN RECREATION 


The foregoing discussion will have served its purpose if it conveys the im- 
pression to social scientists that there is a promising field for research in recrea- 
tion. It is a “growth industry,” which will require increasing public attention 
in the decades ahead; it is a largely undeveloped field, where pioneering studies 
would seem to be possible. Some largely unexploited sources of data exist, and
others could be developed. This combination of factors would seem to offer
great possibilities to enterprising researchers looking for new opportunities. 
Principles of Statistical Analysis. Samuel B. Richmond. New York: The Ronald Press
Company, 1957. Pp. xii, 491. $6.50.


Rudolf J. Freund, Virginia Polytechnic Institute


MANY recent textbooks in statistics for business and social science students have
recognized the increasing importance of statistical inference by adding chapters
on this subject at the end of the book. This procedure does not necessarily imply in- 
creased attention to the teaching of this subject in statistics courses since it simply 
makes the book longer and thus increases the probability that the entire contents of 
the book are not covered during the course. 

Samuel B. Richmond in his book, Principles of Statistical Analysis, has performed 
a turn-about which may prove quite satisfactory. The book starts with nine chapters, 
which is about one-third of the book, on statistical inference. This part of the book 
is very well-written and concise, and is a clear exposition in elementary terms of the 
basic concepts and methods of statistical inference. Many of the explanations of 
new topics provide the student with an excellent background for the use of the 
methods that are given. The main faults with this part of the book are its extreme 
brevity on some topics and the fact that some of the exercises in the early chapters 
seem to presume more knowledge than is contained in corresponding chapters. These 
faults, however, may work to an advantage for those who prefer to use a text as a 
reference for what is taught in class. 

The second section of the book is on descriptive statistics and contains a con- 
glomeration of useful devices for statistical computation and presentation of statisti- 
cal data. It seems to this writer that such material is best covered in a laboratory. 

The final section of the book contains techniques used in forecasting. The tech- 
niques discussed are dependent entirely on the analysis of trends and fluctuations in 
time series, and it is this part of the book that is least useful from an over-all point 
of view. It would appear that a more general exposition of statistical methods for 
business and economics would have been more profitable. For those who want to teach 
this type of analysis, the material that is contained is again presented in a very clear 
and lucid manner with a great many timely warnings of the possible misuses or mis- 
interpretations that are so easily made in statistical analysis of time series. 

In conclusion, this book seems to represent a new approach to teaching business 
statistics. Unfortunately, it falls short by its dependence on traditional time series 
analysis in the forecasting section. However, for those who wish to teach principally 
the use of time series for economic forecasting in an elementary course, the use of 
this book can certainly be recommended. 


A First Course in Statistics. Robert Loveday. New York: Cambridge University Press,
1958. Pp. ix, 121. $1.75. 


R. Clay Sprowls, University of California (Los Angeles)


THIS book is an introduction to descriptive statistics. In short chapters with large
numbers of numerical examples and exercises it covers the following topics: frequency
distributions, cumulative frequency distributions, averages, dispersion, regression
(hand-fitting), correlation (product-moment), rank correlation, time series
(moving averages), weighted averages, and miscellaneous topics. An additional chap- 
ter is devoted to thirty miscellaneous exercises. An appendix includes both answers 
to those exercises which require numerical calculations and a glossary of terms. 


Loveday’s material was organized for and has been taught to “boys and girls”
who are preparing for their university entrance examinations in the English grammar 
school. The equivalent levels of instruction in the United States are university fresh- 
men and junior college courses. Since descriptive statistics is taught as the first course 
(or the first part of almost every course) in statistical methods, this book appears to 
be an alternative choice to many others currently in use. I would recommend the 
book for those instructors who believe that an introduction to statistics should be 
simple, numerical, and practical. For those who believe that even an introduction 
should be analytical, this book will not suffice. 


Economic Models: an Exposition. E. F. Beach. New York: John Wiley & Sons, Inc., 1957. 
Pp. xi, 227. 
John Meyer, Harvard University


AN IMPORTANT, though limited, recent trend in undergraduate economics training
has been the development of introductory courses in analytical methods that 
attempt to integrate instruction in the basic essentials of mathematical model build- 
ing with an introduction to the elementary statistical or econometric techniques 
needed to test these models empirically. Unquestionably, one of the more important 
factors limiting this development has been the lack of any satisfactory textbook. 
Part of the needed material has been excellently presented in several earlier texts— 
the most notable example being Baumol’s Economic Dynamics—but no attempt has 
been made to survey more or less the whole field. The lack of such an “overall” text 
is, of course, at least partly due to the obvious and substantial difficulties involved 
in writing such a book. 

Therefore, in undertaking the task of filling this void, Beach has set a difficult chore 
for himself. He has, moreover, compounded the problem by simultaneously under- 
taking a second goal: presenting a non-mathematical but accurate exposition and 
critique of econometrics and mathematical models that will be useful to professional 
colleagues without mathematics training. 

The critical question, of course, is: To what extent have these goals been met? 
As might be expected, the fairest assessment would seem to be that both goals have 
been at least partially fulfilled while neither has been completely satisfied. 

Those using the book for teaching purposes will find, for example, some of the 
chapters reasonably self-contained and complete (like those on “Elements of Model 
Construction,” “Linear Models,” and “Multiple Relations”). Other chapters require 
both supplemental reading and extensive classroom explanation, the presentation on 
“Sequence Models” needing the most amplification. Furthermore, Beach has a tendency
sometimes to belabor the simple and obvious and to be markedly elliptic when
discussing the complex. Of course, as every teacher of elementary theory or statistics 
probably has felt at some stage in his own experiences, this is almost inherent in the 
nature of the problem. With the simple problems there is more of a common concep- 
tual base for establishing communication and, consequently, the temptation is strong 
to elaborate fully. Contrarily, when dealing with complex concepts, the easiest solu- 
tion often is to touch on only the more important points in the hope that the better 
students will be inspired to do additional reading while the poorer students at least 
will be made no poorer by the exposure. The instructor, however, conventionally 
looks to the textbook to help him with this problem. Indeed, such aid, though difficult 
to provide, might be defined as a principal justification for any textbook. 

This difficulty of finding a proper expositional balance between the simple and the 
difficult is also likely to bother the non-mathematical professional reader. The very 
thorough discussion of the elements of model construction will be superfluous for this 
reader while the very sketchy discussion of sequence analysis probably will prove
frustrating. In addition, the professional, though unskilled in the more advanced 
econometric techniques, is likely to raise some very skeptical questions about the per- 
tinency, and perhaps even about the accuracy, of some of the comments made about 
the applicability of probability models in economics. At the very minimum, these 
extremely controversial points demand a very thorough defense and treatment if 
their presentation is to be useful to the professional reader. Beach might have found 
the space for this expanded treatment, moreover, simply by dropping some of the 
more tangential, and often somewhat misleading references to certain specialized 
problems in econometric testing; a prime example of where a useful deletion could 
have been made is to be found in the discussion of “regression slippage” on pages 150 
through 152. 

In sum, Beach does not provide quite as much textbook help as might be desirable 
in an introductory course in economic analysis and does not answer all the questions 
that the non-mathematical professional is likely to ask about econometric models 
and methods. But these are minor, not major criticisms. Most of Beach’s critical 
evaluations are remarkably well-written and should prove interesting, even though
not always convincing, to both the econometrician and the non-mathematical
economist.


Distributed Lags and Demand Analysis for Agricultural and Other Commodities. Marc 
Nerlove. Agricultural Handbook No. 141, Agricultural Marketing Service, USDA, 1958. 
Available Supt. of Doc., U. S. Government Printing Office, Washington 25, D. C. Pp. 121, 
$0.60. Paper. 


Ben C. French, University of California (Davis) 


IN THE words of the author “this publication summarizes available literature on the 
use of distributed lags in the analysis of demand for individual commodities and 
contributes a substantial amount of new material to the problem of estimating dy- 
namic demand relationships.” 

Distributed lags arise when the effect of some causal factor is spread over a period 
of time. For example, a change in price of a commodity may produce changes in rates 
of consumption that are fully realized only after the lapse of some time. The lagged 
distribution of effect is a result of (1) technological and institutional rigidities or 
(2) psychological factors which involve uncertainty as to the permanence of various 
changes. Somewhat different analytical problems arise depending on which of these 
forces is in effect. 

Following the introductory discussion, a section of the report is devoted to review- 
ing empirical analyses in which distributed lags have been used. The next three sec- 
tions consider models for generating distributed lags based on (a) technological and 
institutional rigidities, (b) uncertainty about the future, and (c) combinations of 
both. Major portions of these sections are devoted to problems and methods of ob- 
taining reduced equations—equations derived from or related to the equations in- 
volving distributed lags, but which do not themselves involve distributed lags. 
Reduced equations may be estimated statistically more easily than the parent equations
that involve distributed lags. 
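
As an illustration of what a reduced equation looks like, the following Python sketch assumes a geometrically declining lag (the Koyck form), a choice this review does not itself specify. The names `distributed_lag`, `reduced_form`, `a`, `b` and `lam`, and all the numbers, are hypothetical. The sketch checks numerically that consumption computed from the whole lag distribution on past prices agrees with a reduced equation involving only the current price and the previous period's consumption.

```python
def distributed_lag(x, a, b, lam, n_terms=200):
    """Consumption from the full lag distribution on current and past prices.

    y_t = a + b * (1 - lam) * sum_k lam**k * x_{t-k}, truncated at n_terms.
    Prices before the sample are assumed (hypothetically) equal to x[0].
    """
    y = []
    for t in range(len(x)):
        weighted = 0.0
        for k in range(n_terms):
            past_price = x[t - k] if t - k >= 0 else x[0]
            weighted += lam ** k * past_price
        y.append(a + b * (1 - lam) * weighted)
    return y


def reduced_form(x, a, b, lam):
    """Reduced equation: y_t = a*(1-lam) + b*(1-lam)*x_t + lam*y_{t-1}.

    No distributed lag appears; the lagged dependent variable carries the
    whole price history. The start value assumes equilibrium at price x[0].
    """
    y_prev = a + b * x[0]
    y = []
    for price in x:
        y_t = a * (1 - lam) + b * (1 - lam) * price + lam * y_prev
        y.append(y_t)
        y_prev = y_t
    return y


# Hypothetical price series with one permanent rise from 10 to 12.
prices = [10, 10, 12, 12, 12, 12]
full = distributed_lag(prices, a=100.0, b=-2.0, lam=0.6)
reduced = reduced_form(prices, a=100.0, b=-2.0, lam=0.6)
```

Under these assumptions the two series coincide, and consumption moves only gradually toward its new level after the price change, which is the spreading of effect over time described above.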

Following sections discuss methods of statistical estimation of equations involving 
distributed lags, both directly and by using reduced equations. Included is a con- 
sideration of problems and effects of serial correlation and interdependency of varia- 
bles within a system. 

Final sections are devoted to a consideration of the demand for durables and the 
implications of Friedman’s permanent income hypothesis for demand analysis. Although
they are clearly related to the concept of distributed lags, these last sections 
seem to dangle somewhat from the main body of the report. 

Readers without a fair amount of training in mathematics and statistics will have 
some difficulty with this report. It is intended largely as a handbook or reference work 
for people actively engaged in or having a detailed interest in statistical analysis of 
demand. Emphasis is on theory and methodology rather than on examples of appli- 
cations. 

The use of distributed lags in demand analysis appears very promising. However, 
computational difficulties and unsolved problems of identification and introduced 
serial correlation would seem to limit current use to relatively simple cases. Sim- 
plicity depends not only on the number of variables and equations involved, but 
also on the assumed source of distributed lag. Models involving technological and 
institutional rigidities can be handled with much greater facility than models based 
on rigidities of an expectational nature. 

The author states that “it is hoped that this publication will result both in a 
stimulation of additional research on methodology and in increased use of this tech- 
nique in applied areas.” It seems very likely that his hopes will be realized. 


The Carolina Economy: Resource Chartbook to the Future. University of South Carolina, 
Bureau of Business Research. Columbia: University of South Carolina, 1958. Pp. 52. $1.00. 
Paper. 


Philip Bourque, University of Washington 


THIS publication, essentially a chartbook as its subtitle indicates, compares South 
Carolina, the South Atlantic region, and the United States with respect to 25 
time series relating to population, income, employment, production, construction, 
and retail sales. The historical variables, garnered from a variety of sources, are gen- 
erally presented quinquennially from 1920 and annually from 1945. The emphasis 
of the chartbook, and its possible significance to readers, lies in the projections of the 
time series. “The concern is to point out possible trends rather than to isolate cyclical 
variations from the trends.” In the reviewer’s opinion the projections, obtained by 
extrapolating mathematical curves fitted to a poorly chosen period, are of doubtful 
value. 

To assess the projection of each series individually would serve little purpose. Ob- 
viously, an appraisal of the projections must run in terms of assumptions and tech- 
niques, for we cannot here employ Milton Friedman’s rule that a test of a model is 
its predictive success. The study is vulnerable to several serious criticisms: (1) The 
trends are fitted to the nine years between 1945 through 1954 to project ahead for a 
period of 20 years; such a span of experience is too brief for so extended a projection 
of series like these. The authors acknowledge that “few of us will agree that the events 
between 1955 and 1975 will unfold in exactly the same degree and form as they did 
from 1945 through 1954”; few, I should think, would find it plausible to utilize the 
war and postwar years of 1945-46 as the initial years for the fitted trends. (2) The 
choice of the form of trend line to fit a set of observations is a matter of judgment, but 
the use of a linear least-squares trend with the implication of a declining rate of 
increase for every series seems unwarranted. (3) All extrapolations are computed 
directly from the individual series with no attempt to maintain consistency among 
the different projections; for example, the projections of total population and total 
income imply a very different per capita income from that indicated by the projection 
of per capita income directly. 
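
Criticisms (2) and (3) can be made concrete with a small Python sketch. All figures below are hypothetical and are not taken from the chartbook; the helper names `fit_linear_trend`, `project` and `growth_rate` are likewise invented for illustration. The sketch fits a linear least-squares trend to each series separately, then compares the per capita income implied by the two totals with the per capita income projected directly.

```python
def fit_linear_trend(years, values):
    """Ordinary least-squares line through (year, value) pairs."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope  # (intercept, slope)


def project(trend, year):
    """Extrapolate a fitted linear trend to the given year."""
    intercept, slope = trend
    return intercept + slope * year


def growth_rate(trend, year):
    """Percentage rate of increase implied by the trend in a given year."""
    return project(trend, year) / project(trend, year - 1) - 1.0


# Hypothetical annual series for 1945 through 1954.
years = list(range(1945, 1955))
population = [2.00 + 0.02 * (y - 1945) for y in years]        # millions
total_income = [1500.0 + 100.0 * (y - 1945) for y in years]   # $ millions
per_capita = [inc / pop for inc, pop in zip(total_income, population)]

pop_trend = fit_linear_trend(years, population)
inc_trend = fit_linear_trend(years, total_income)
pc_trend = fit_linear_trend(years, per_capita)

# Per capita income for 1975 obtained two ways.
implied_1975 = project(inc_trend, 1975) / project(pop_trend, 1975)
direct_1975 = project(pc_trend, 1975)
```

With these made-up numbers the direct projection of per capita income exceeds the figure implied by the projected totals, because the ratio of two linear trends is not itself linear. The same linear income trend also shows a lower percentage rate of increase in 1975 than in 1955, since a constant absolute increment is applied to an ever larger base.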

Occasionally there is an attempt to describe the controlling factors of change in 
the well-written descriptive account of each series, but the study could have been 
improved if the projections had been organized around an explicit model of causation, 
interaction, and development. The publication has little to merit the attention of 
either economists or statisticians, except possibly as a guide to sources of statistics 
(the footnoting appears impeccable), but may appear attractive to the nonprofes- 
sional who is not too discerning about the methods of trend projection. 


Industrial Output of the Ukraine, 1913-1956: A Statistical Analysis. Vsevolod S. Holubny- 
chy. Munich: Institute for the Study of the USSR, 1957. Pp. ix, 63. 


Warren Nutter, University of Virginia 


BETWEEN 1938 and 1956 virtually no statistics were published on output of individual
Soviet industries. The best information one could find appeared in obscure 
announcements couched in terms of percentage changes from some ambiguous base. 
Study of the Soviet economy became an exercise in archeology. With an enormous 
expenditure of time, the researcher might succeed in unearthing fragments of this 
and that, which he or somebody else might then be able, with enormous expenditure 
of time, to piece together into a set of intelligible, if unreliable, figures. This little 
statistical abstract presents the results of such an effort. 

Unfortunately, Mr. Holubnychy’s labor has been largely put to waste by an official 
handbook, National Economy of the Ukraine (Narodnoe gospodarstvo Ukrainskoi SSR), 
also published in 1957. The user of data will generally prefer to go directly to the 
official handbook, since Holubnychy’s data are, in effect, estimates derived indirectly 
from fragmentary official statistics. The two sets of figures correspond closely in many 
cases, but diverge widely in others. For example, Holubnychy’s estimates for recent 
years are larger than the official data by 26 per cent for cement and by 90 per cent 
for locomotives; on the other hand, they are lower by 15 per cent for butter, by 23 
per cent for vegetable oil, by 60 per cent for meat, by 37 per cent for cotton fabrics, 
and by 52 per cent for woolen fabrics. 

The specialist will find it useful to consult Holubnychy’s data, since they cover 
more years than the official handbook. The two sources should, however, be used 
jointly, so that proper adjustments may be made on the basis of data for benchmark 
dates. When all is done, the user is still faced with what are probably the poorest 
and least reliable industrial statistics in the modern world. The blame for this does 
not lie with Holubnychy. 


High-Speed Data Processing. C. C. Gotlieb and J. N. P. Hume. New York: McGraw- 
Hill Book Company, 1958. Pp. xi, 338. $9.50. 


Harry Eisenpress, International Business Machines Corp. 


THE statistician who wants a general introduction to the field of electronic computers
will find this book useful. It will be particularly valuable for obtaining 
some insight into the efficient handling of masses of data and large numbers of 
records, i.e., data processing, rather than scientific computing. Of course, a good part 
of the book is applicable also to scientific computing, since, as the authors point out 
in the first chapter, the two uses of computers have many points of similarity. 

The book is ambitious in its scope, for it attempts to cover machine description, 
preparation of machine instructions, and data processing, all in one volume of average 
size. The authors have succeeded in producing a fairly comprehensive volume, but 
many readers will probably find it necessary to dip into one of the references given 
in the bibliography for more complete information on any given topic. 

After a brief historical introduction on computers, the authors develop, in Chaps. 
2 to 4, the typical way in which an electronic computer moves a number (or record) 
through its circuitry. Chaps. 5 to 7 present the types of instructions which accomplish 
arithmetic and input-output operations, and methods of combining these instruc- 
tions, or programming. The use of flow diagrams is illustrated, and programs are 
worked out for payroll calculation, mortgage amortization, income-tax calculation, 
inventory maintenance, and utility billing. Chaps. 8 and 9 discuss special program- 
ming problems, such as program checking, and optimization of coding to achieve 
minimal running time. Chaps. 10 to 12 outline methods of handling files of records, 
and apply these methods to insurance, accounting, and banking problems. The dis- 
cussion of sorting is especially interesting, and fairly complete. Chap. 13 extends the 
range of applications to problems of simulation, inventory control, and linear pro- 
gramming. Chap. 14 is a good summary of developments in automatic programming, 
including assembly routines, compilers, and interpreters. 

The programming chapters do not seem to have been designed for teaching the 
art of programming, but a set of problems at the end of the book will presumably 
accomplish this objective. It would have been better if this part of the text had been 
written in a less descriptive, and more pedagogic manner, since most persons who 
are interested in the details of programming are also anxious to learn and use them. 

Some minor flaws may be mentioned: Some terms, such as file, marker, block, are 
defined rather late in the volume, after being used rather frequently on previous 
pages. Details of variations among machines with respect to their different types of 
instructions are a little overwhelming and unnecessary, as is also the discussion of 
the result of adding two alphabetic characters in various machines. 

The volume is, nevertheless, a welcome addition to the library of the statistician, 
whose computational problems can now be greatly eased by the electronic computer. 


Experimental Designs in Industry. Edited by Victor Chew. New York: John Wiley and 
Sons, 1958. Pp. xi, 268. $6.00. 


Stephen Harrison, Ingleside, Illinois 


THIS book is based upon a symposium held at North Carolina State College in 
November 1956, sponsored by the Mathematics Division, Air Force Office of 
Scientific Research, Air Research and Development Command. 

Whenever I hear that the proceedings of a symposium have been published in 
book form, I immediately become prejudiced against it, since such books are, in my 
experience, generally of uneven quality, and lacking in any real integration. I there- 
fore picked up this book with some misgivings; these, however, were quickly dis- 
pelled. I read it through, from cover to cover, and found it almost uniformly clear, 
instructive and interesting. Perhaps when one studies the list of 37 participants and 
contributors at the Symposium this becomes less surprising; it reads like the Statis- 
ticians’ “Who’s Who,” ranging all the way from Anderson to Youden. 

The coverage of Experimental Designs is broad and includes Complete Factorials, 
Fractional Factorials, Confounding, Split-Plot designs, Incomplete Blocks, and no 
less than two chapters are devoted to the exploration of response surfaces (one of 
them, appropriately, by Box and Hunter). There is a separate chapter on Multiple 
Correlation. The first four chapters introduce the types of designs, and these are 
followed by five chapters describing industrial applications. There is also an extensive 
Bibliography. 

The value of the book is greatly enhanced by Chew’s Introductory Chapter, which 
offers a bird’s eye view of the whole subject. It is very succinct and very much to 
the point; it is also diplomatic. Chew points out that Experimental Design theory 
grew up in the province of agriculture, and has required modifications and extensions 
to make it fully useful in engineering and technology. This is because “Industrial 
Engineers have the advantage over the Agricultural or Biological Sciences in that 
their experimental units are often more homogeneous, their readings more precise 
and their experimental conditions more reproducible.” Also their experiments often 
take less time to run, thus inviting a more sequential approach, which has been ex- 
ploited by Box and others. 

Although it is true that the error variance, or “noise,” is often less in industrial 
situations, I think it is only fair to point out that experimenters in this field have 
their own special “noise” problems. In particular, one has the problem of what might 
be termed “secular noise” if this is not a contradiction in terms. For example, halfway 
through one experiment I was running, the foreman suddenly got the notion that the 
results of the experiment would be used for rate-fixing. As a result, the personnel 
operating the process started to behave in a very different fashion in the way they 
did their jobs. On another occasion, the manager of the plant became afraid 
that, among other things, the results of the experiment would give a good unbiased 
estimate of the overall yield efficiency of the operation (he was right). He therefore 
gave secret orders for a couple of barrels of tailings to be added back to the process. 
In neither case did this lead to any biasing of the estimates of the effects of the varia- 
bles under study, but we were left holding our heads wondering why the error es- 
timates had grown so large. Sometimes, however, treatment effect estimates may 
become biased. On one occasion, the personnel operating the process were reshuffled 
in such a way as to confound some of the main effects and interactions. This was not 
done with any skulduggerous intent on the part of management but arose through 
unavoidable absenteeism. Considerations like the above constitute, in my experience, 
an important practical difference between designing and running experiments in 
science and those in industry. 

Looking through this book as a whole, one gets the impression that, compared 
with “classical” experimental design theory, there is perhaps somewhat less emphasis 
on probability theory and more emphasis on the subtlety of the design in giving in- 
sight into the functional relationships between variables. No doubt this is partly due 
to the smaller error variances, as noted by Chew, which may also be responsible for 
the ironical result, reflected in many chapters of this book, that the payoff in terms 
of increased efficiency through use of modern experimental design seems to be much 
greater in the fields of engineering and technology than in that of agriculture which 
gave it birth. In an issue of Biometrics some years back, Gertrude Cox, who, by the 
way, also contributes a foreword to this book, reported a gain of efficiency of 23% 
relative to what would have been obtained if simple randomized blocks had been used. 
Her estimate was based upon an examination of a large number of agricultural ex- 
periments covering many design types. It seems to me that this is a rather modest 
gain, especially when one must include on the debit side the extra skilled man-hours 
required in planning the experiments and in analysing the data. In contrast, gains in 
efficiency in the engineering field are often ten-fold or more. This only increases one’s 
sense of indebtedness to the agricultural sciences, since one feels that they did most 
of the hard work and the rest of us are getting most of the benefit. 

Chew points out in the opening chapter that Experimentation is an art as well as 
a science, and the kind of statistical tool which one uses depends upon one’s prior 
knowledge of the nature of the process. I think it goes deeper than this; the approach 
adopted is also very much a function of the personality of the experimenter. For 
example, though all agree that randomization is essential, there are large differences 
of opinion as to how thorough this should be. Again, some experimenters like to push 
their luck more than others in the prior assumptions they are prepared to make about 
the nature of the process. Also, some prefer the large, well-thought-out comprehensive
experiment; others prefer a more sequential approach, utilizing information 
that turns up along the way. This last difference in strategy is somewhat analogous 
to that between Generals Montgomery and Patton, either of whom, we are assured, 
could have won the war by Christmas, given a free hand. I suspect that the differ- 
ences between the styles of experimenters often depend upon whether they approach 
experimental design through the avenue of mathematics or that of research in the 
natural sciences. One thing that makes the application of experimental design such a 
lively and interesting subject is just this fact that it lies on the borderline between 
science and mathematics; it is an area where mathematicians and scientists can meet, 
and in a spirit of friendly camaraderie, explain just exactly what it is they dislike 
about each other. And from this exchange, everyone stands to gain. Both types have 
contributed to this book, neither getting the upper hand. 

The two chapters on response-surface-exploration designs will be particularly 
welcomed by non-mathematicians, who, like myself, were baffled by the mathematical 
technicalities of some of the earlier papers on the subject. Incidentally, the mathe- 
matics used throughout the book is kept simple, and is not stressed. 

There is a frustrating preface to the book which lists five papers that were not 
included. The absence of two of these is particularly to be deplored. One is titled 
“Evolutionary Operation: A Method for Increasing Industrial Productivity,” by 
G. E. P. Box. The other is “Where Do We Go From Here?” by John Tukey. 

As is customary, Type I and Type II errors are both mentioned at the outset, 
after which, Type II errors fade out of the picture. This is not, of course, the fault 
of the authors; it is a major lacuna in the present state of the Art. It seems to me 
that unless one has some definite notion of the cost of making a Type II error, relative 
to that of making a Type I error, and also some a priori notion of the probability of 
a Type II error, no given significance level can be shown to be preferable to any 
other, and one might as well use a table of random numbers. Surely it is time that 
the classical method of testing hypotheses were absorbed into some kind of compre- 
hensive decision functions procedure? Yet, so far as I am aware, this does not exist 
(at any rate for continuous statistics). Most of my colleagues do not agree with me, 
as I know to my cost. Every time I raise the subject I am made to feel like one men- 
tioning sex in a Victorian drawing room. 
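
The dependence of the preferable significance level on relative costs can be sketched in Python under some strong simplifying assumptions that are not the reviewer's: a one-sided test on a standardized normal statistic, a single alternative at a known `shift` (in standard-error units), and a prior probability `prior_alt` that the alternative is true, which is one reading of the "a priori notion" mentioned above. All function names, parameter names and numbers are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def type_ii_probability(alpha, shift):
    """Chance of missing a true shift when testing at level alpha."""
    critical = STD_NORMAL.inv_cdf(1.0 - alpha)
    return STD_NORMAL.cdf(critical - shift)


def expected_cost(alpha, shift, prior_alt, cost_i, cost_ii):
    """Prior-weighted cost of the two kinds of error at level alpha."""
    beta = type_ii_probability(alpha, shift)
    return (1.0 - prior_alt) * alpha * cost_i + prior_alt * beta * cost_ii


def cheapest_alpha(shift, prior_alt, cost_i, cost_ii):
    """Grid search for the significance level with least expected cost."""
    grid = [i / 1000 for i in range(1, 500)]
    return min(
        grid,
        key=lambda a: expected_cost(a, shift, prior_alt, cost_i, cost_ii),
    )


# Same experiment, opposite cost ratios (hypothetical values).
alpha_when_type_i_costly = cheapest_alpha(2.5, 0.5, cost_i=10.0, cost_ii=1.0)
alpha_when_type_ii_costly = cheapest_alpha(2.5, 0.5, cost_i=1.0, cost_ii=10.0)
```

In this sketch the cheapest level is small when Type I errors are the expensive ones and much larger when Type II errors are, so no single conventional level is best for both cases. Without the two costs and the prior, the function `expected_cost` cannot be evaluated at all, which is the gap complained of above.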

The large size of the pages (8½” by 11”) seemed to make reading much easier. 
The whole book is reproduced by the offset process, but this does not seem to have 
adversely affected clearness or legibility and the publishers tell me that this is re- 
sponsible for the low price of the book. 


A Comprehensive Bibliography on Operations Research. Operations Research Group of 
Case Institute of Technology. New York: John Wiley & Sons, Inc., 1958. Pp. xi, 188. $6.50. 


H. M. Weingartner, University of Chicago 


THIS volume consists of approximately 3,000 titles of books and articles on operations
research published through the year 1956, with a supplement for 1957. The 
items are listed by author in alphabetical order, followed for each letter by a section 
of publications by no specified authors, and a section of items in foreign languages. 
Each entry is classified by a marginal ten-digit code word giving information on type 
of organization (industry), function or type of activity, technique, and other aspects. 
These code words were punched on cards, then sorted to obtain some forty special 
bibliographies printed in the book following the alphabetical entries. (These special 
bibliographies do not include the items from the year 1957.) 
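
The card-sorting procedure described above amounts to grouping entries on one field of the code word. A minimal Python sketch follows; the field positions in `FIELDS`, the sample code words and the placeholder titles are all hypothetical, since the review says only that the code word has ten digits covering organization, function, technique, and other aspects.

```python
from collections import defaultdict

# Hypothetical layout of the ten-digit code word (positions are assumed).
FIELDS = {
    "organization": slice(0, 2),
    "function": slice(2, 4),
    "technique": slice(4, 6),
}


def special_bibliographies(entries, field):
    """Group (code_word, title) entries by one field of the code word.

    Returns a dict mapping each field value to an alphabetized list of
    titles, which mimics sorting the punched cards on those columns.
    """
    groups = defaultdict(list)
    for code_word, title in entries:
        groups[code_word[FIELDS[field]]].append(title)
    return {key: sorted(titles) for key, titles in sorted(groups.items())}


# Placeholder entries, not real items from the bibliography.
entries = [
    ("0102030000", "Entry C"),
    ("0205070000", "Entry B"),
    ("0302031234", "Entry A"),
]
by_technique = special_bibliographies(entries, "technique")
```

Any error in the code word of an entry places it in the wrong group, so the quality of such derived lists depends entirely on the accuracy of the original coding.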
The authors’ general rule on coverage was to include all articles for an operations 
research audience, but to avoid the development of a bibliography of statistics, mathematics
and economics. Some consideration was given to availability, but in any case, 
all articles published in the Journal of the Operations Research Society of America, 
the Operational Research Quarterly, Management Science and the Naval Research 
Logistics Quarterly were included. In compiling the special bibliographies the choice 
of classifications was admittedly arbitrary, the general criterion having been apparent 
interest and the avoidance of lengthy lists. The punched cards themselves are avail- 
able from the Operations Research Society at $35 a set. 

In a rapidly growing field such as Operations Research a compilation of articles is 
considerably out of date at publication time. Much of the published work in this 
field represents advances and refinements in the approaches to problems that have 
been posed before, or in techniques that have been developed before. Thus there is a 
considerable continuity provided by the references in the current literature. At the 
same time, with the current long lead-times in the publication of articles in journals 
and for other reasons, many of the developments appear in research memoranda of 
limited circulation, and frequently never appear in other form. Such considerations 
put a damper on the enthusiasm with which one might approach this volume. 

Unfortunately, the book itself does not overcome these limitations by the quality 
of its execution. An alarmingly high proportion of articles is coded “General” or 
“Not Applicable” in all of its class codes. A somewhat closer look reveals significant 
gaps in areas ostensibly completely covered or nearly so; the omission of research 
memoranda of wider accessibility and greater interest (in this reviewer’s opinion) 
than much of the material listed; and the inclusion of some items one could never 
hope to locate.¹ One could also question the rule which leads to the exclusion of 
such books as William Feller’s An Introduction to Probability Theory and Its Applications 
and the inclusion of John Dewey’s Logic, The Theory of Inquiry. Finally, the topical 
bibliographies are marred by errors due to incorrect coding and sorting, which would 
lead to similar errors in the ingenious “do-it-yourself” bibliographies. However, it is 
refreshing to note that Stephen Potter’s The Theory and Practice of Gamesmanship is 
there with the coding “book, game theory, other.” 


NEL Reliability Bibliography—Supplement 1. Naval Electronics Laboratory. Washington, 
D. C.: U.S. Dept. of Commerce, Office of Technical Services. Pp. unnumbered. $3.00. 


Judah Rosenblatt, Purdue University 


THE need for a bibliography in a field of applied statistics as important as reliability 
is apparent to anyone who has ever tried to determine what has been accomplished 
in this field. For such a bibliography to be useful it must (1) be arranged in ho- 
mogeneous, meaningful categories; and (2) contain relatively complete abstracts of 
the articles reviewed. In particular, with regard to the first criterion, articles present- 
ing new results should be separated from run-of-the-mill summaries and reviews. 
Unfortunately this reviewer feels that the NEL Reliability Bibliography is seriously 
deficient in both of the respects mentioned above. The NEL Reliability Bibliography 
used a breakdown into the following categories: Circuit Design; Components; Elec- 
tron Tubes; Failure Analysis; General; Human Engineering; Maintenance; Mechan- 
ical Design; Systems; Testing. It would seem that a category-breakdown of the fol- 
lowing kind would be more useful: 


1. Theoretical Models 
   a. Components 
      1) Capacitors 
      2) Electron Tubes 
   b. Systems 
      1) Circuit Design 
      2) Mechanical Design 
2. Empirical Studies 
   a. Quantitative Studies 
      1) Components 
         a) Capacitors 
         b) Electron Tubes 
      2) Systems 
         a) Failure Analysis 
         b) Particular Systems 
         c) Results using specific design criteria 
      3) Human Engineering 
   b. Qualitative Studies 
3. Summaries, Reviews, and Articles on Philosophy 


¹ For instance, on p. 59, we find “Gamow, G. I., Certain aspects of battle theory, Technical Memorandum No. 
230, Aug. 1953.” 


Several of the summaries are incomplete; for example, in 4-10, “Failure of Complex 
Equipment,” the purpose of the study is given, but no hint is apparent as to whether 
there were any results. 

Due to the redundancy in the categories of this bibliography, the article “Factors 
Influencing Capacitor Reliability” is in the “General” section, but not in the “Com- 
ponents” section, while the article “Diagnostic Program for the Illiac” is in the 
“Failure Analysis” section only, and “Trouble Shooting Techniques” is only found 
in the “Maintenance” section. 

In summary, this reviewer feels that a more orderly group of categories, and in 
places a more complete summary of certain articles, are necessary. Without these 
modifications, the NEL Reliability Bibliography is of very little use to one who needs 
information on Reliability. 


Linear Programming and Associated Techniques, A Comprehensive Bibliography. Vera 
Riley and Saul J. Gass. Baltimore, Maryland: The Johns Hopkins Press, 1958. Pp. x, 613. 
$6.00. 


Harvey M. Wagner, Stanford University 


REVIEWING a “comprehensive bibliography” is not unlike reviewing the telephone 
book (including the “yellow” pages), about which it has been said that the plot 
is dull but the cast of characters staggering. So it is with the Riley-Gass compilation— 
including the paper binding. The authors have collated more than 1000 references 
dated earlier than June, 1957 related to the topics of linear, nonlinear, and dynamic 
programming. The story line, such as may be discerned, starts with a few introductory 
remarks on the mathematical model of linear programming and an outline of the 
revised simplex method, and then launches swiftly into the classifications General 
Theory (Mathematical Theory, Computational Techniques, ..., Game Theory), 
Applications (General Survey, Industrial Applications, Transportation Problems), 
Nonlinear and Dynamic Programming, with a complete Author Index for the re- 
capitulation. 

A precis appears for each referenced book, article, technical report, discussion 
paper, thesis, chapter, conference abstract, etc., often taken from the source itself. 
As the authors note, “Although different styles and technical levels of abstracting 
may have thrown the composite bibliography somewhat out-of-focus, the com- 
pensating gain in time has made this service invaluable.” The reviewer wholeheart- 
edly approves of the authors’ decision to use their available resources in such a 
manner as to minimize the time from compilation to publication. But it is only 
fair to warn readers that, as anticipated, certain aberrations do prove to be dis- 
concerting, for instance, the sporadic obituary notices, or more correctly, lack of 
them (e.g., J. C. C. McKinsey, A. Wald), and the sometimes inane occupational 
title accompanying an institutional affiliation (e.g., Private, U.S. Army). The book 
is more of an historical piece than was probably intended—the bibliography in citing 
institutional affiliations as of June, 1957, provides a noteworthy source of data for a 
statistical study of job mobility in the “operations research” profession. 

Despite some misciting, which appears to have been kept to a just noticeable level, 
some palpable oversighting (e.g., R. G. D. Allen, Mathematical Economics; Z. Go- 
zinto, “An Output Matrix for Production Control”), perhaps a pre-sputnik an- 
tipathy leading the authors to miss citing important and fundamental contributions 
from the Eastern Bloc (e.g., L. V. Kantorovitch’s 1939 paper “Matematischeskie 
metody organizatii i planirovania proizvodstva”), and a measure of overciting due 
to the repeating of an entire reference, abstract and all, each time it is classified, this 
bibliography is worthy of being placed permanently on the shelf of any statistician 
who has a strong interest in mathematical programming. 


Efficiency in Government Through Systems Analysis, With Special Emphasis on Water 
Resources Development. Roland N. McKean, New York: John Wiley and Sons, Inc., 
1958. Pp. x, 336. $8.00. 


John V. Krutilla, Resources for the Future 


Efficiency in Government Through Systems Analysis, although not addressed to 
statisticians, is an excellent discussion of the nature of applied analysis, and the 
range of problems for analysis associated with governmental operations. 

The study is organized into four substantive parts that follow a short introductory 
Part I. Part II is a fairly extensive discussion of the nature of applied analysis. Here 
in some seventy pages McKean distills the analytic lore accumulated at RAND, 
and elsewhere by fellow operations researchers. His chapters on the nature of cri- 
teria, the selection for analysis of relevant alternative courses of action and the ele- 
ments of uncertainty and intangibles in problems of analysis focus attention on 
deliberate thought-taking in approaching problems. Here is treated in systematic, 
non-technical fashion, a range of significant questions which have received but scat- 
tered treatment elsewhere in the literature, despite the fact that they are faced daily 
(although often not explicitly) in applied analysis. For one on the outside, or only 
on the fringe, of the operations research fraternity, these chapters will provide in- 
structive reading. Moreover they represent a worthwhile review for all who are 
interested in the area. The final chapter of this part is a systematic general discussion 
of investment criteria under conditions of varying degrees and circumstances of 
capital rationing. This, taken in conjunction with the subsequent discussion of special 
problems in analysis of water resource projects, provides an extensive discussion of 
investment criteria for water resource development programs. 

Most of Part III is devoted to a critical analysis of benefit estimation problems in 
an area where non-marketable project outputs predominate. The discussion of 
“spillovers” (external economies and diseconomies) distinguished neatly between 
physical and economic interdependence with a systematic treatment of the classes 
of pecuniary spillovers and the circumstances which give rise to them. The discussion 
of “secondary benefits” treated against this background of classes of pecuniary spill- 
overs, aids in illustrating convincingly the dangers of double counting. Here some of 
the most vexing problems in valuation of benefits in public resource development 
programs are handled with great finesse. I find reason to take exception in on!y two 
particulars; see below. 

Part IV consists of two single-chapter case studies, the Green River Watershed 
and the Santa Maria Project. On the whole this is a useful complement to the 
methodological sections. I would have preferred cases which were of greater intrinsic 
interest, yet the ones selected serve well the purpose of illustrating applications of 
systems analysis. 
a. Quantitative Studies
b) Electron Tubes                b) Particular Systems
                                 c) Results using specific design criteria
b. Qualitative Studies           3) Human Engineering
3. Summaries, Reviews, and Articles on Philosophy


Several of the summaries are incomplete; for example, in 4-10, “Failure of Complex 
Equipment,” the purpose of the study is given, but no hint is apparent as to whether 
there were any results. 

Due to the redundancy in the categories of this bibliography, the article “Factors 
Influencing Capacitor Reliability” is in the “General” section, but not in the “Com- 
ponents” section, while the article “Diagnostic Program for the Illiac” is in the 
“Failure Analysis” section only, and “Trouble Shooting Techniques” is only found 
in the “Maintenance” section. 

In summary, this reviewer feels that a more orderly group of categories, and in 
places a more complete summary of certain articles, are necessary. Without these 
modifications, the NEL Reliability Bibliography is of very little use to one who needs 
information on Reliability. 


Linear Programming and Associated Techniques, A Comprehensive Bibliography. Vera 
Riley and Saul I. Gass. Baltimore, Maryland: The Johns Hopkins Press, 1958. Pp. x, 613. 
$6.00. 


Harvey M. Wagner, Stanford University


Reviewing a “comprehensive bibliography” is not unlike reviewing the telephone
book (including the “yellow” pages), about which it has been said that the plot
is dull but the cast of characters staggering. So it is with the Riley-Gass compilation— 
including the paper binding. The authors have collated more than 1000 references 
dated earlier than June, 1957 related to the topics of linear, nonlinear, and dynamic 
programming. The story line, such as may be discerned, starts with a few introductory 
remarks on the mathematical model of linear programming and an outline of the 
revised simplex method, and then launches swiftly into the classifications General 
Theory (Mathematical Theory, Computational Techniques, ..., Game Theory), 
Applications (General Survey, Industrial Applications, Transportation Problems), 
Nonlinear and Dynamic Programming, with a complete Author Index for the re- 
capitulation. 

A precis appears for each referenced book, article, technical report, discussion 
paper, thesis, chapter, conference abstract, etc., often taken from the source itself. 
As the authors note, “Although different styles and technical levels of abstracting 
may have thrown the composite bibliography somewhat out-of-focus, the com- 
pensating gain in time has made this service invaluable.” The reviewer wholeheart- 
edly approves of the authors’ decision to use their available resources in such a 
manner as to minimize the time from compilation to publication. But it is only 
fair to warn readers that, as anticipated, certain aberrations do prove to be dis- 
concerting, for instance, the sporadic obituary notices, or more correctly, lack of 
them (e.g., J. C. C. McKinsey, A. Wald), and the sometimes inane occupational 
title accompanying an institutional affiliation (e.g., Private, U. S. Army). The book 
is more of an historical piece than was probably intended—the bibliography in citing 
institutional affiliations as of June, 1957, provides a noteworthy source of data for a
statistical study of job mobility in the “operations research” profession. 

Despite some misciting, which appears to have been kept to a just noticeable level, 
some palpable oversighting (e.g., R. G. D. Allen, Mathematical Economics; Z. Go- 
zinto, “An Output Matrix for Production Control”), perhaps a pre-sputnik an- 
tipathy leading the authors to miss citing important and fundamental contributions 
from the Eastern Bloc (e.g., L. V. Kantorovitch’s 1939 paper “Matematischeskie 
metody organizatii i planirovania proizvodstva”), and a measure of overciting due 
to the repeating of an entire reference, abstract and all, each time it is classified, this 
bibliography is worthy of being placed permanently on the shelf of any statistician 
who has a strong interest in mathematical programming. 


Efficiency in Government Through Systems Analysis, With Special Emphasis on Water 
Resources Development. Roland N. McKean, New York: John Wiley and Sons, Inc., 
1958. Pp. x, 336. $8.00. 


John V. Krutilla, Resources for the Future


Efficiency in Government Through Systems Analysis, although not addressed to
statisticians, is an excellent discussion of the nature of applied analysis, and the 
range of problems for analysis associated with governmental operations. 

The study is organized into four substantive parts that follow a short introductory 
Part I. Part II is a fairly extensive discussion of the nature of applied analysis. Here 
in some seventy pages McKean distills the analytic lore accumulated at RAND, 
and elsewhere by fellow operations researchers. His chapters on the nature of cri- 
teria, the selection for analysis of relevant alternative courses of action and the ele- 
ments of uncertainty and intangibles in problems of analysis focus attention on 
deliberate thought-taking in approaching problems. Here is treated in systematic, 
non-technical fashion, a range of significant questions which have received but scat- 
tered treatment elsewhere in the literature, despite the fact that they are faced daily 
(although often not explicitly) in applied analysis. For one on the outside, or only 
on the fringe, of the operations research fraternity, these chapters will provide in- 
structive reading. Moreover they represent a worthwhile review for all who are 
interested in the area. The final chapter of this part is a systematic general discussion 
of investment criteria under conditions of varying degrees and circumstances of 
capital rationing. This, taken in conjunction with the subsequent discussion of special 
problems in analysis of water resource projects, provides an extensive discussion of 
investment criteria for water resource development programs. 

Most of Part III is devoted to a critical analysis of benefit estimation problems in 
an area where non-marketable project outputs predominate. The discussion of
“spillovers” (external economies and diseconomies) distinguishes neatly between
physical and economic interdependence, with a systematic treatment of the classes
of pecuniary spillovers and the circumstances which give rise to them. The discussion
of “secondary benefits,” treated against this background of classes of pecuniary
spillovers, aids in illustrating convincingly the dangers of double counting. Here some of
the most vexing problems in valuation of benefits in public resource development 
programs are handled with great finesse. I find reason to take exception in only two 
particulars; see below. 

Part IV consists of two single-chapter case studies, the Green River Watershed 
and the Santa Maria Project. On the whole this is a useful complement to the 
methodological sections. I would have preferred cases which were of greater intrinsic 
interest, yet the ones selected serve well the purpose of illustrating applications of 
systems analysis. 

The study would stand as an integral piece with the inclusion of only the first
four parts. Part V, on other applications of analysis to increase government efficiency, 
however, is thrown in for good measure. I have mixed feelings about its inclusion. 
Its two chapters, “Analysis for Performance Budgets: An Illustration” and “Ana- 
lytical Aids to Government Economy: A Survey of Opportunities,” detract from the 
unity and focus of the main stream of the study. Here the aim appears to have 
been shifted from analysts as the intended beneficiaries to administrators who may 
be persuaded of the value of analysis. While either of these chapters would make 
acceptable papers or memoranda, they suffer by comparison with the clearly superior 
quality of the remainder of the study. 

When McKean deals with general problems of analysis, his work is distinguished 
by its systematic treatment, comprehensive coverage of special cases, and skill in 
exposition. It is enlivened by a spritely style with many delightful examples to 
illustrate concretely the somewhat elusive distinguishing characteristics of “sys- 
tems analysis.” This portion of the study is beyond question the product of a gifted, 
sophisticated analyst. But his application to actual problems suffers on occasion from 
a tendency to assume inappropriately that the public area is an analog of the private 
sector, and by insufficient recognition of critical institutional considerations. I men- 
tion two examples. 

The first involves his recommendation for treatment of taxes when benefit-cost
analysis is used to compare public and private ventures in the water resources field. 
He recognizes that a large proportion of taxes are simply income transfers rather 
than real costs, and thus would not include them on the cost side when comparing 
two public projects. But when comparing a proposed public venture with a competing 
private alternative, as in the case of Hells Canyon, he would ignore this on the
grounds that for the sake of comparability, the public project ought to be charged 
with “Federal, State and Local taxes which would be levied were there no exemp- 
tions” (p. 165). As a practical matter, state and local taxes can be regarded as related 
to incremental costs of public services required in connection with a new develop- 
ment, and appear eligible for inclusion as real costs under any circumstances. But 
there is no comparable justification for inclusion of federal corporate taxes. It is more 
appropriate to eliminate such tax levies from the cost side of the analysis of private 
ventures, in comparison with public ventures, instead of adding them in the analysis 
of public ventures, since they represent transfers in any event. 

Secondly, while McKean’s discussion of investment criteria under various condi- 
tions covers the range, and is correct and skillfully done, his choice of a criterion for 
actual application is not consistent with some fundamental attributes of the American
scene, e.g., the constitutional division of powers among the central and local units
of government, and to a lesser extent also the characteristic legislative and budgetary 
procedures in the federal government. 

McKean is dissatisfied with the conventional criterion of justification, the benefit-cost
ratio, as it represents a temptation to rank projects without regard to absolute
size of gains and costs. But of greater concern to McKean, such rankings would tend 
to maximize benefits with respect to total expenditures, whereas optimization requires 
maximizing with respect to the limitational factor. In his view it is investment capital 
which is the scarce resource, and the marginal internal rate of return is the correct
criterion under conditions of budgetary restrictions. He correctly represents investment
funds as “resources which must be brought to these ventures from outside
because the resources are not made available up to that time by income from the 
ventures themselves” (p. 114). Actually, services provided by water resource develop- 
ment programs are in large part non-revenue producing, and where marketable, 
revenues often do not accrue to the agency whose budget finances the development, 
hence, do not relax the effectiveness of the budgetary constraint. To reconcile this
with the internal rate of return criterion he argues for adopting a broad view of the
matter in which “Benefits to whomsoever they may accrue are treated as receipts, 
and the costs to be covered by this budget are outlays not matched by such receipts” 
(p. 115). He alleges that ranking by the marginal internal rate of return will be the 
correct criterion if we take “a fairly optimistic view of Congress’ approach to the 
problem, namely that Congress will deliberate about the allocation of an over-all 
water-resource budget” (p. 116). He recognizes that if (a) Congress allocates each 
agency’s budget separately, or (b) the net benefits are not actually reinvested at the 
marginal internal rate of return, the criterion has no applicability (pp. 85, 91, fn. 
10 and 116). Yet, in spite of these admissions, this very criterion is proposed as the 
correct one in his analysis of the Green River Watershed (p. 212). Here, federal, 
non-federal public and private parties are engaged in a cooperative program of inter- 
dependent investment. Since there is no institutional machinery to effect a transfer 
of funds among budgets of participating parties so as to equalize the social marginal 
productivity of capital from each budget, an internal rate of return for a synthetic 
“national” water resource development budget does not provide a realistic optimiz- 
ing guide. Since the study purports to be a practical guide to economic efficiency in 
the public sector, it must take the federal system as given and recognize that Congress 
has no power to deliberate over state, local and private components of a “national” 
water resource budget. 
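
The contrast between the two rankings can be made concrete with a small numerical sketch. The three projects, their figures, and the capital budget below are hypothetical and appear nowhere in McKean's study. The sketch also ranks by net present benefits per dollar of rationed capital, which is a simplified stand-in for the marginal internal rate of return and not McKean's own computation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    """A hypothetical project; all figures are present values (illustrative only)."""
    name: str
    capital: float    # outlay drawn from the rationed investment budget
    operating: float  # other costs, not charged to the rationed budget
    benefits: float   # benefits "to whomsoever they may accrue"

    @property
    def bc_ratio(self) -> float:
        # Conventional benefit-cost ratio: benefits over total costs.
        return self.benefits / (self.capital + self.operating)

    @property
    def return_per_capital(self) -> float:
        # Benefits net of operating costs, per dollar of the scarce factor.
        return (self.benefits - self.operating) / self.capital

    @property
    def net_gain(self) -> float:
        return self.benefits - self.operating - self.capital


def select(projects, budget, key):
    """Take projects in descending order of `key` while capital remains."""
    chosen, remaining = [], budget
    for p in sorted(projects, key=key, reverse=True):
        if p.capital <= remaining:
            chosen.append(p)
            remaining -= p.capital
    return chosen


# Hypothetical data: three projects competing for a capital budget of 200.
projects = [
    Project("A", capital=100, operating=300, benefits=480),
    Project("B", capital=100, operating=20, benefits=168),
    Project("C", capital=100, operating=0, benefits=130),
]
CAPITAL_BUDGET = 200

by_ratio = select(projects, CAPITAL_BUDGET, key=lambda p: p.bc_ratio)
by_capital = select(projects, CAPITAL_BUDGET, key=lambda p: p.return_per_capital)

for label, chosen in (("benefit-cost ratio", by_ratio),
                      ("return per capital dollar", by_capital)):
    names = ", ".join(p.name for p in chosen)
    total = sum(p.net_gain for p in chosen)
    print(f"Ranked by {label}: {names}; total net gain {total:g}")
```

With these made-up figures the ratio ranking selects B and C for a total net gain of 78, while the capital-based ranking selects A and B for a total net gain of 128. The sketch assumes a single pooled capital budget, which is precisely the assumption questioned in the paragraph above.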

Realistic optimizing guides under these circumstances will recognize several sepa- 
rate budgets, with doubtless different probabilities for reinvestment of net yields 
and perhaps different degrees of budgetary tightness. While abstracting from insti- 
tutional reality will provide a criterion for an idealized sub-optimum (a maximum 
subject to only one institutional blemish, a capital ration) it neglects an excellent 
opportunity to illustrate systems analysis in a most productive practical role. Here 
criteria could be illustrated for use in maximizing subject to various institutional
and policy side conditions to achieve realistic “second best” (or to borrow from Sam- 
uelson, “feasible first-best”) solutions. I believe that in this respect McKean’s ap- 
proach is less flexible and actually inferior to existing practices of federal agencies 
when confronted with cooperative ventures for interdependent investment programs. 

If systems analysis is not in part an exhortation to be better informed and more 
intelligent than we can hope to be, it represents at best only an imperfect substitute 
for omniscience. Thus, the criticisms above may be akin to beating a good wife for 
not being better. Efficiency in Government Through Systems Analysis remains, despite 
these points, an excellent book that will be required reading to pass the threshold 
into analytic grace, and perhaps an occasional re-reading as an aid to retaining that 
state. 


Statistisches Jahrbuch für die Bundesrepublik Deutschland, 1958. Wiesbaden: Statistisches
Bundesamt, 1958. Pp. xxiv, 764. DM 28.00.


Statistisches Taschenbuch, Pocket-Book of Statistics, Annuaire Statistique de Poche 
1958. Bundesrepublik Deutschland. Wiesbaden: Statistisches Bundesamt, 1958. Pp. 264, 
2 maps. DM 6.80. 


Jerry W. Combs, Jr., U. S. Bureau of the Census


This is the seventh year of publication for Statistisches Jahrbuch, but only the first
year of publication of the trilingual edition of the Pocket-Book of Statistics. They 
are reviewed together because, with exceptions noted later, the latter covers in 
abbreviated form the same ground as does the former and much larger book. 

The general format of the Federal Republic’s yearbook has changed little in the 
seven years of publication. The 1958 edition consists of 24 chapters on various phases 
of the size, growth, and characteristics of the population, and of the political, social,
and economic activity within the country. Data for the Federal Republic are supple- 
mented by sections dealing with the Soviet Zone of Occupation of Germany (Sow- 
jetische Besatzungszone), the eastern territories (Ostgebiete des Deutschen Reiches 
z.Z. unter fremder Verwaltung), and by 176 pages of international statistics. Data 
relating to the Soviet Zone of Germany now have less utility than formerly because 
the German Democratic Republic has published its own yearbooks since 1955, and 
the Federal Republic statisticians now rely principally on them. Some effort has been 
made, however, to render the Soviet Zone data in categories comparable to Federal 
Republic practice. Information relating to the “eastern territories” is limited. The 
section on international statistics will have small appeal to the English-speaking 
audience since most of the information included is already available in English. 

With reference to data concerning the Federal Republic, however, it must be con- 
sidered an excellent statistical compendium. The current volume contains consider- 
able information not included in earlier years, among which are the results of the 1957 
Federal Diet elections, 1956/57 population statistics, the 1956 census of trades 
(handicrafts), and data on wages, income, and corporation taxes for 1954 and 1955. 
National income statistics have been presented in this volume in constant as well as 
actual prices, whereas in the 1957 volume only the latter data were shown. Despite 
these additions, the size of the book has actually been reduced by compressing ma- 
terial previously published. A major weakness is that much of the data relating to 
the Saar, incorporated into the Federal Republic on January 1, 1957, cannot be fully 
integrated with data for the rest of the country and is presented separately in many 
chapters. Differences in classification have not yet been adjusted. 

The first edition of the trilingual Pocket-Book of Statistics should fulfill its editors’ 
hope “that this booklet will facilitate for users abroad access to concepts and defini- 
tions in German statistics and help to inform a wider public abroad on statistical 
facts and figures in Germany.” It is an attractive book, literally pocket-size, whose 
format consists of pages of German text and tables facing pages containing English 
and French titles, headings, notes, etc. Corresponding to the book’s size, type is 
small, but this reviewer found the book exceedingly easy to use. The English transla- 
tion, by Erika Noering, is excellent. 

The Pocket-Book is organized in almost precisely the same manner as the larger 
Statistisches Jahrbuch, and duplicates all but three chapter headings of the larger 
book. Two chapters of the Jahrbuch have been omitted; two others have been com- 
bined. Considerably greater reduction has been achieved in chapter content through 
reducing the number of tables, compressing or omitting footnotes, and restricting 
data to major regions of the Federal Republic and to the major categories of each 
subject. While no one with knowledge of German will wish to restrict himself to the 
smaller volume, it will serve even for such a student as a useful and handy reference 
volume. Departure from strict replication of contents of the larger volume is occa- 
sionally noted. For example, the population by age in the Jahrbuch is for December 
31, 1956, but in the Pocket-Book age data are given for December 31, 1955. 

Probably the most interesting information for many in either book concerns the 
revised population statistics for recent years. As a result of a census of housing taken 
on September 25, 1956, population data for the Federal Republic from 1947 through 
1956 have been revised retrospectively. Population figures for each year 1952 through 
1956 have been reduced by more than 800,000, and smaller reductions have been 
made for earlier years. The 1950 census figure was lowered by some 600,000. It is 
regretted that no explanation for this revision, other than its basis in the 1956 hous- 
ing census, is offered in either of these books. Nor have data for the affected years, 
other than total figures, been revised. Other sources, however, indicate that the revision
was made because the Federal Statistical Office became convinced that the
1956 census was more reliable than either the 1950 census or the migration data for 
the period 1947 through 1951. Vital statistics have not been questioned. Double- 
counting apparently explains the 600,000 error in the 1950 census. 


Demographic Yearbook, 1957 (in English and French). Statistical Office of the United 
Nations, Department of Economic and Social Affairs. New York: Columbia University 
Press, 1958. Pp. viii, 656. Cloth $8.00, Paper $6.50. 


Warren S. Thompson, Miami University


Except for 1949-50 this Yearbook has been published annually beginning with an
issue for 1948. As the title denotes, its content is confined chiefly to the presentation
of data relating to total population and its composition, to natality and mortality
and to other data closely related to movements of population in all parts of the world 
from which such data can be secured. Many estimates are included where precise 
data are lacking. A chapter devoted to “Technical Notes on the Statistical Tables” 
evaluates the data quite objectively. 

With the establishment of the Yearbook it was decided to give special emphasis 
each year to some aspect of demography. Pursuant to this policy the volume being 
considered here contains an unusual amount and variety of material relating to mor- 
tality. Comparisons are made with earlier periods as regards changes in mortality as 
a whole, to mortality by age and sex and some information is given on the changes in 
the relative importance of several of the more significant causes of death. A very in- 
teresting and useful series of Life Tables is also included, the data for the entire 
period 1900-1950 being given where available. 

This particular volume also contains some new tables (5) on Migration not given 
hitherto. This Yearbook is indispensable to the student of population, but it should 
also be of interest to the layman who merely wants to know in a general way what 
demographic changes are taking place, and the significance of those changes. He will 
find here the data for the world as a whole as well as for the different countries. 

It is a highly competent piece of work merely as a Yearbook, and the special treatment
given to different aspects of demography from year to year provides an insight
into the dynamics of population change based on the latest available data which 
cannot so easily be obtained elsewhere. 


Year Book of Labour Statistics, 1957. Geneva: International Labour Office, 1957. Pp. xvi, 
535. $5.00 paper. $6.00 cloth. 


David L. Kaplan, Bureau of the Census


This compendium needs no introduction to most statisticians and economists. It
is a prime reference source and serves its users well. It has, of course, limitations
in content—additional subjects and cross-classifications, additional detail within 
the covered subjects, additional information on national definitions and concepts 
would all be helpful. But a halt has to be called somewhere and 500-plus pages of 
clearly-presented information constitutes a major contribution. (Anyone who has 
ever tried to set up a table of reasonably comparable statistics for a number of 
countries appreciates the difficulty of the ILO’s task.) However, a reviewer can wonder
whether the ILO’s publication system makes the most effective use of the available
resources. At present, each issue becomes virtually obsolete as its successor is
published. The system used for the UN’s Demographic Yearbook makes each issue a 
supplement to, rather than a replacement for, its immediate predecessor by having 
successive issues specialize in different subjects. 

Compared with the 1956 edition, the current issue does not contain the special
appendix on certain newly-available U.S.S.R. data, nor the table on social security 
receipts and expenditures; but the other 38 tables remain virtually identical in format 
and subject content. There has, however, been a significant increase in the number of 
countries represented in most of the tables. A measure of this growth in coverage is 
the gain of more than 10 percent in pages for these 38 tables between the 1956 and 
1957 issues. 

The reviewer has two other comments. First, he looks forward to the day (perhaps 
after the 1960 censuses) when the ILO will feel that enough use is being made of its 
International Standard Classification of Occupations to warrant a separate table on 
the occupational structure of the economically active population, similar to the pres- 
ent Table 4, which focuses on industry. Second, he is still somewhat unnerved by the 
abandonment of the familiar chocolate-brown paper cover in favor of a modernistic 
orange and gray decor; such a massive changeover should probably have been ac- 
complished gradually over several years. 


The Older Population of the United States. Henry D. Sheldon, with introductory and 
summary chapters by Clark Tibbitts. New York: John Wiley & Sons, 1958. Pp. ix, 223. 
$6.00. 


Irving L. Webber, University of Florida


This is another volume in the Census Monograph Series, prepared for the Social
Science Research Council in cooperation with the U. S. Bureau of the Census.
The work is a welcome addition to the rapidly expanding shelf of books dealing with 
social and economic aspects of aging. 

In the short introductory chapter Clark Tibbitts traces the brief history of the 
growth of interest in aging and the aged in this country and presents an essentially 
sociological interpretation of the present position of older people in American society. 
His ostensible purpose is to bring into focus the significance of study of the older 
population rather than to make a substantive contribution. Chaps. 2 through 8 deal 
in turn with the changing age structure; geographic distribution of the older popula- 
tion; age and employment; age and occupation; marital status, the family cycle, liv- 
ing arrangements and age; housing; and age and income. The final chapter, also writ- 
ten by Tibbitts, is a summary with comments. More than a third of the volume con- 
sists of seven appendixes corresponding in subject matter to the principal substan- 
tive chapters. 

Sheldon’s treatment of his material gives ample evidence of thorough familiarity 
with the nature of the data, awareness of crucial questions to be answered, and com- 
petence in using techniques of demographic analysis. He takes pains to set out in 
some detail the methodological problems which confront the student. What emerges 
is a carefully developed picture of the numbers, proportions, distribution, and selected 
characteristics of the older population (variously considered as persons 60 or 65 years 
of age and over) of the United States in 1950, with some trend data, mainly for the 
periods 1890-1950 and 1900-1950. Tibbitts’ summary and comment serves a useful 
function, particularly insofar as it points up the broader implications of Sheldon’s 
results and relates the findings to the role and status of the elderly in modern indus- 
trial society. The appendixes consist of comments on the relevant concepts and the 
nature and limitations of the data followed by detailed tables presenting the basic 
statistics. 

Special interest attaches to Chap. 4, in which the relationship of age and employ- 
ment is examined. Departing from the pattern followed elsewhere in the monograph 
of depending almost exclusively on data from the census and the Current Population 
Survey, Sheldon draws on the findings of a number of special sample surveys in an
effort to shed light on the role of disability, circumstances associated with retirement, 
including the part played by health, and the potentialities of the retired for further 
employment. Although the results of this analysis are by no means satisfactory or 
conclusive, they raise interesting questions and suggest promising directions for 
further study. 

The shortcomings of this monograph inhere for the most part in the scope and the 
available data. Its avowed purpose is to deal with the situation of the older popula- 
tion of the nation; hence it does not explore, and indeed tends to mask, many impor- 
tant geographical and group differentials. Mortality of the older population is con- 
sidered only in connection with trends in numbers and proportions and its effect on 
geographic distribution. The author has chosen to pursue his analysis without refer- 
ence, in most cases, to previous studies which also utilized census data. This practice 
is regrettable, especially in the case of phenomena such as internal migration, which 
has recently been given careful scrutiny by other demographers. Finally, the reviewer 
observed a minor lapse in connection with Sheldon’s effort to explain the high propor- 
tion of old people residing in Osceola County, Florida, in 1950 in terms of a “recently 
established . . . large scale community project for elderly retired persons” (p. 39). 
The county in that year had 21.6 per cent of its population in the 65-and-over bracket, 
but more than half of these lived in the town (3,001 inhabitants) of St. Cloud, estab- 
lished several decades ago. It was the 41.6 per cent of St. Cloud’s population in the 
aged category which largely explained Osceola County’s status. 

This monograph should be of interest and value to all those who are concerned with 
the social aspects of aging and the aged, and it should be of special value in stimulat- 
ing further demographic research in this hitherto neglected area. 


The Social Desirability Variable in Personality Assessment and Research. Allen L. 
Edwards. New York: Henry Holt & Co. (Dryden Press), 1957. Pp. viii, 108. $2.75. 


D. A. Sprott, Waterloo College, Waterloo, Ontario 


This book is a summary of the author’s researches on the subject of social desirability, 
started in 1952. His contention is that statements describing personality 
can be represented on a one-dimensional scale of social desirability. Chap. 2 compares 
the social desirability scale values derived from different sources and concludes that 
age and sex are not important. Chap. 3 considers the relationship between the social 
desirability scale value of a personality statement and the probability that this state- 
ment will be endorsed by subjects when applied to themselves. It is concluded that 
the probability of endorsement of an item in a personality inventory is positively and 
highly correlated with the social desirability scale value of the item. Chap. 4 is a 
description of a social desirability scale based on 79 items in the Minnesota Multiphasic 
Personality Inventory (MMPI). Chap. 5 considers the scale values of items in 
various scales in the MMPI, using some results of factor analysis. Chap. 6 is a discus- 
sion of faking, or “the conscious distortion of scores in terms of response tendencies 
of the subject taking the inventories.” Chap. 7 is about the “forced-choice” inventory, 
that is, the procedure of asking subjects to choose between two or more statements. 
Chap. 8 considers social desirability and the Q technique, in which subjects are asked 
to describe themselves by sorting personality statements into a set of successive cate- 
gories, ranging from the least descriptive to the most descriptive. Finally, Chap. 9 pre- 
sents the implications for personality assessment and research. 
This book will not be of much interest to statisticians, and indeed, they will prob- 
ably find it difficult to follow in places, unless they are familiar with the subject. For 
instance, the whole procedure depends on the use of certain scaling methods, which 
are not discussed in the book. Along that line, it might be of some interest to know 
how much the choice of a certain scaling method will influence the results. Also, the 
reviewer found it difficult to see where Chap. 7 was leading. 

The only statistical concept used in most of the book is the coefficient of correla- 
tion. In Chap. 8, mention is made of the factorial design. Chap. 5 uses some results 
of factor analysis. It is unfortunate that psychological literature shows no recognition 
of the fact that factor analysis, in which there are none of the a priori conditions char- 
acteristic of the designed experiment, is regarded with suspicion by many statisticians. 


Morphological Integration. Everett C. Olson and Robert L. Miller. Chicago: University of 
Chicago Press, 1958. Pp. xv, 317. $10.00. 


D. R. Cox, Birkbeck College, University of London 


This book discusses the interpretation of systems of multiple measurements that 
arise in studies of the evolution of species and of the growth of individuals. Some 
idea of the methods used can be got from the following very much simplified account. 

Much of the work is with fossils. For each type, there are usually about 15-40 
specimens and on each specimen a large number of kinds of measurement, up to 50 
in some cases, are available. The authors’ central thesis is that such material can be 
properly interpreted only by a thoroughly multivariate approach. Two of the steps 
in their analysis are: 


(a) the construction of an index of morphological integration I_ρ; 

(b) the use of “basic-pair” analysis. 

The index I_ρ is determined as follows. Calculate the sample correlation coefficient 
between every pair of variables. If the variables x and y have a correlation numerically 
not significantly less than |ρ|, say that x and y are bonded at level ρ. When ρ = 0 all 
pairs are bonded. As ρ increases, bonds will drop out. A ρ-group is a group of variables 
such that any pair in the group are bonded at level ρ. (When ρ is large, the number of 
ρ-groups will fall until, when ρ is near one, only one bond, and finally no bond, 
remains.) Lastly, a non-contained group is a ρ-group not contained within another 
ρ-group. The “strength of intercorrelation” of the measurements at level ρ is measured 
by B_ρ, the number of bonds, and the way the correlations are distributed by 
B_ρ/K_ρ, where K_ρ is the number of non-contained groups. Finally, these are combined 
into an index I_ρ, varying between 0 and 1: 

$$I_\rho = \frac{4 B_\rho^{2}}{K_\rho \, n^{2}(n-1)^{2}}$$

The first steps in the authors’ analysis are usually a general assessment of the 
amount of integration, using I_ρ, and then a study of patterns, using the ρ-groups 
formed at various values of ρ. Detailed comparisons are continued using “basic-pairs” 
(i.e., pairs of variables more highly correlated with each other than with other 
variables) and groups constructed using “basic-pairs.” 

A considerable number of examples are worked through in detail, with lengthy 
discussion of the biological interpretation of the index I_ρ and of the form of the 
ρ-groups, especially when comparable fossils at different stages of evolution are 
examined. 
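
The procedure described above can be illustrated with a minimal sketch. It rests on several assumptions that are not in the book under review: a bond is declared whenever the absolute sample correlation is at least the chosen level (the significance test is omitted), a non-contained group is taken to be a maximal set of mutually bonded variables with at least two members, n is taken to be the number of variables, and the index is taken as the product of the bond count and the bonds-per-group ratio scaled by its largest possible value. The function names are hypothetical.

```python
from itertools import combinations


def bonds_at_level(corr, level):
    """Pairs (i, j) whose absolute correlation is at least `level`.

    Assumption: a plain threshold stands in for the significance test.
    """
    n = len(corr)
    return {(i, j) for i, j in combinations(range(n), 2)
            if abs(corr[i][j]) >= level}


def non_contained_groups(n, bonds):
    """Maximal sets of mutually bonded variables (size >= 2), via Bron-Kerbosch."""
    adj = {i: set() for i in range(n)}
    for i, j in bonds:
        adj[i].add(j)
        adj[j].add(i)
    groups = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            if len(current) >= 2:          # an isolated variable is not a group
                groups.append(frozenset(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(range(n)), set())
    return groups


def integration_index(corr, level):
    """Index 4 * B^2 / (K * n^2 * (n - 1)^2); taken as 0 when no bonds remain."""
    n = len(corr)
    bonds = bonds_at_level(corr, level)
    if not bonds:
        return 0.0
    b = len(bonds)
    k = len(non_contained_groups(n, bonds))
    return 4 * b * b / (k * n ** 2 * (n - 1) ** 2)


def basic_pairs(corr):
    """Pairs in which each variable is the other's strongest correlate."""
    n = len(corr)
    best = [max((j for j in range(n) if j != i),
                key=lambda j: abs(corr[i][j])) for i in range(n)]
    return [(i, best[i]) for i in range(n)
            if i < best[i] and best[best[i]] == i]
```

Calling `integration_index(corr, level)` over a grid of levels on a symmetric correlation matrix traces how the index falls as bonds drop out. With 15 to 40 specimens the individual correlations are themselves imprecise, so such a trace is only as reliable as the correlations behind it.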

The interpretation is thus based on complex patterns existing among a large num- 
ber of correlation coefficients calculated from small samples. Workers in fields where 
fresh material can easily be obtained, and where the observations for study are fairly 
clearly defined and small in number, may well be sceptical of the whole approach. 
However the procedures are flexible and therefore may, when combined with the 
authors’ detailed knowledge of the material, lead to worthwhile ideas. Conclusions 
based solely on the statistical analysis are presumably liable to be very shaky. The 
authors discuss fully the difficulties of their procedures. 

The original data are given in full, and it would be interesting to know the result 
of applying procedures, such as principal component analysis, that do not depend 
solely on comparisons of pairs. Note that in assessing differences between sets of 
observations, only changes in the within-group correlation matrix are considered. 
Information contained in changes of mean between groups is ignored. 

The book is on the whole clearly written. There is, however, an occasional tend- 
ency to express a simple idea in a high-sounding way. For instance, instead of say- 
ing that correlations between all possible pairs will be considered, the set of all meas- 
urements on the species is denoted by M. Then: 


“From M form a product space, Q, where elements are all possible pairs (X_i, X_j). 
Then Q is mapped into A: where |ρ_ij| is the image of (X_i, X_j); A is the closed interval 
[0, 1]; and |ρ_ij| denotes the correlation coefficient between a pair of measures X_i 
and X_j.” 


One is reminded of the old adage: “when in doubt, blind ’em with science”. 


Table of Factors for One-Sided Tolerance Limits for a Normal Distribution. Donald B. 
Owen. Albuquerque, New Mexico: Sandia Corporation, Technical Information Division, 
1958. Pp. 131. No price given. Paper. 

This pamphlet is essentially an extensive empirical comparison of four methods of 
using the noncentral-t distribution, namely: (1) the Johnson and Welch table 
(Biometrika, 1939); (2) the Resnikoff and Lieberman table (Stanford University 
Press, 1957); (3) the Jennett and Welch approximation (Journal of the Royal Statistical 
Society, Supplement, 1939) and (4) an unpublished approximation of R. M. McClung 
(1955). 

These methods are compared by using each to compute the multiple of the sample 
standard deviation which should be added to the sample mean in order to obtain an 
upper tolerance limit, that is, a number below which at least a stated proportion of 
the population lies with a stated confidence coefficient. 
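
As a brief sketch of how the noncentral t distribution enters, the multiplier can be written in the standard textbook form below. The notation is an assumption of this sketch and is not taken from the pamphlet: n is the sample size, P the stated proportion, γ the confidence coefficient, z_P the standard normal quantile of order P, and t'_γ(ν, δ) the quantile of order γ of the noncentral t distribution with ν degrees of freedom and noncentrality δ.

```latex
% Upper tolerance limit: \bar{x} + k s, with k chosen so that
\Pr\left( \bar{X} + k S \ge \mu + z_P \sigma \right) = \gamma .
% Since \sqrt{n}\,(\mu + z_P\sigma - \bar{X})/S is noncentral t with
% n - 1 degrees of freedom and noncentrality z_P \sqrt{n},
k = \frac{ t'_{\gamma}\left( n-1,\; z_P \sqrt{n} \right) }{ \sqrt{n} } .
```

An error in k therefore corresponds to an error in γ or in P only through the noncentral t distribution function, which is why the size of one is hard to judge from the size of the other.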

The resulting comparison of the four methods, via the multipliers of the standard 
deviation that they produce, is difficult to interpret meaningfully. For a meaningful 
interpretation, it would be desirable to know the error in the stated confidence coeffi- 
cient or in the stated proportion of the population covered; and this cannot be judged 
at all readily from the error in the multiplier of the standard deviation. 

The part of the pamphlet that may be most useful consists of pages 9-13. These 
show the appropriate multipliers of the standard deviation obtained from the Resni- 
koff-Lieberman table. They cover confidence coefficients 0.75, 0.90, 0.95, 0.99; pro- 
portions included 0.75, 0.85, 0.90, 0.935, 0.96, 0.975, 0.99, 0.996, 0.9975, and 0.999; 
and sample sizes from 3 to 25 and 30, 35, 40, 45, 50, and . 

W. A. W. 
bly, analysis and presentation of data; also with EAM and electronic com- 
puting equipment; familiarity with statistical series of Government and 
business. Salary range: $11,000 to $13,000. 


MATHEMATICIANS 


Advanced degree level or equivalent in training and experience. Applicants 
should be skilled in numerical analysis or have some background in Opera- 
tions Research. Familiarity with stochastic models particularly desirable. 
Salary range: $9,000 to $13,000. 


COMPUTER PROGRAMMERS 


Experienced—to work on IBM 704 and 709—trajectories, information re- 
trieval, linear programming or applied mathematics preferred. A strong 
background in mathematics and/or knowledge of EAM equipment useful. 


ADVANCE YOUR CAREER—Join one of the nation's fastest growing 
computer-oriented research organizations. Fringe benefits include 
advanced study program, attractive profit-sharing and retirement 
plans. 

Send resume to: 


CORPORATION FOR Economic 
AND INDUSTRIAL RESEARCH 


1200 JEFFERSON DAVIS HIGHWAY, ARLINGTON 2, VA. 
MATHEMATICIANS 


For analysis group of expanding Research & Develop- 
ment Laboratory. Principal fields of interest are weap- 
ons systems analysis, electromagnetic propagation, op- 
erations research, nuclear phenomena, probability and 
statistics. 


Several openings are available for Mathematicians with 
masters or doctorates in mathematics or physics. These 
openings require men with vision and initiative. 


STATISTICIANS 


M.S. or Ph.D. with 2 to 5 years’ experience. To conduct 
statistical investigations in broad area of applications: 


Weapons Systems Analysis Logistics 
Design of Experiments Reliability Prediction 


Our modern laboratory provides a professional work- 
ing atmosphere and the location in a quiet suburban 
area makes for pleasant living and working with easy 
access to the cultural and educational facilities of met- 
ropolitan New York and New Jersey. Liberal benefits 
include a tuition refund plan. 


All inquiries in confidence. Please send resume includ- 
ing salary requirements to A. A. Franklin. 





VITRO LABORATORIES 


DIVISION OF VITRO CORPORATION OF AMERICA 
200 Pleasant Valley Way, West Orange, New Jersey 
(Other laboratories located at Eglin AF Base, Fla. & Silver Spring, Md.) 
THE IDEA 


Behind 
MARKET RESEARCH 
TABULATING 


PRELIMINARY 
PLANNING 


PRACTICAL 
PROCEDURE 




STATISTICAL service on market research 
tabulating begins long before a 
button is pushed. 


You get preliminary assistance in resolving 
your ideas .. . in translating sound thinking 
into well-planned questionnaires for the most 
practical and economical processing. 


There is always a best way to handle any 
assignment and STATISTICAL can help you apply 
it through long experience in methods 

and procedures. 

The same careful approach is used in processing 
data to assure highest quality in market information. 
Strict controls are maintained every step of the way 
from editing and coding to finished report. 


And this professional service is available to you 
days, nights, week-ends—any time you need it. 


Write for details today 








STATISTICAL TABULATING CORPORATION 
Established 1933 - Michael R. Notaro, President 
TABULATING • CALCULATING • TYPING • TEMPORARY OFFICE PERSONNEL 

General Offices: 53 West Jackson Blvd., Chicago 4, Illinois 
Phone: HArrison 7-4500 

Chicago • New York • St. Louis • Newark • Cleveland • Los Angeles 
Kansas City • Milwaukee • San Francisco 

THREE NEW HOLT-DRYDEN TEXTBOOKS FOR 1959 





STATISTICS AS APPLIED TO ECONOMICS 
AND BUSINESS 


ROBERT H. WESSELL, University of Cincinnati 
EDWARD R. WILLETT, Northeastern University 


Stressing basic ideas rather than formulas and derivations, Statistics as Ap- 
plied to Economics and Business is designed for elementary statistics courses 
in Business Administration and Economics. This much-needed new book is 
non-mathematical and non-symbolic in both content and approach, including 
all material suitable for the student who has had a minimum of mathematical 


training. January 1959 


ELEMENTS OF MATHEMATICAL STATISTICS 
D. RANSOM WHITNEY, The Ohio State University 


An introduction to the concepts and methods of statistical inference, this text 
is aimed primarily at the undergraduate student who has completed a course 
in calculus. In it, Professor Whitney has taken the middle road between the 
text that gives only statistical techniques and the one that attempts complete 


mathematical rigor. May 1959 


A BASIC COURSE IN SOCIOLOGICAL STATISTICS 
MORRIS ZELDITCH, JR., Columbia University 


This combined text- and workbook, with examples and exercises drawn from 
contemporary sociological data, offers all the material necessary for a complete 
one-semester course in sociological statistics. The understanding of procedures 
—rather than mere skill in manipulation—is emphasized; and the student is 
shown how to choose the appropriate statistical method for a given problem. 

March 1959 





HENRY HOLT AND COMPANY 


383 Madison Avenue, New York 17, N. Y. 
New McGRAW-HILL Books 


A PRIMER OF PROGRAMMING 
FOR DIGITAL COMPUTERS 
By Marshal H. Wrubel, Indiana University. Ready in April. 





An introductory text, designed for junior-senior courses for physical scientists, engineers, 
and all other students who have problems to solve on computers. The purpose of the book 
is to explain how to go about setting up a problem for a digital computer, how to test it, 
and how to make it available to others. The primer discusses procedures common to all 
digital electronic machines, but an actual machine—the IBM Type 650—is used for 


examples. 


PROBABILITY AND STATISTICS FOR 

BUSINESS DECISIONS 

An Introduction to Managerial Economics Under Uncertainty 
By Robert Schlaifer, Harvard University. Ready in April. 


A nonmathematical introduction to the logical analysis of practical business problems in 
which a decision must be reached under uncertainty. The analysis which it recommends is 
based on the modern theory of utility and what has come to be known as the “personal” 





definition of probability. Exercises are provided at the end of each chapter. 


HIGH-SPEED DATA 
PROCESSING 


By C. C. GOTLIEB and J. N. P. HUME, both 
of the University of Toronto. McGraw-Hill 
Series in Information Processing and Com- 
puters. 338 pages, $9.50 


A basic, comprehensive treatment of the im- 
portant principles and general techniques of 
processing data at high speeds. It shows how 
data processors work, how to use them, and 
what their advantages are. Coding and pro- 
gramming methods are included and ex- 
amples of typical applications of high-speed 
data processing are shown. 


APPLIED STATISTICS 
FOR ENGINEERS 


By WILLIAM VOLK, Hydrocarbon Research, 
Inc., Princeton, N.J. McGraw-Hill Series in 
Chemical Engineering. 354 pages, $9.50 


Provides enough background and examples 
of statistical problems to enable practicing 
engineers to apply statistical analysis to 
their data. Emphasizing applications and 
providing many illustrative examples, it 
deals with the treatment of engineering data 
for correlation, precision, and analysis of 
experimental factors. Each chapter is com- 
plete in itself. 


Send for copies on approval 





McGRAW-HILL BOOK COMPANY, INC. 


330 WEST 42nd STREET, 


NEW YORK 36, N.Y. 
A complete 
tabulating service 
Billing 

Sales Analysis 

Payrolls 


Pension Planning 


Market Research 


JOHN FELIX 


veeetetirs 


3 EAST 54TH STREET • NEW YORK 22, NEW YORK 


PLaza 1-2050 
Announcing 
May 
publication 


of 


This book is written for students who wish a 
functional knowledge of statistics to help them ade- 
quately solve problems needing statistical analysis. 


The text not only relates how the various statis- 
tical techniques work, but also why the techniques 
are used and how their properties are derived. 
When interpreting statistical methods, the book 
continually stresses approaching them by intelligent 
scientific analyses. After stating a statistical method, 
the text always points out why such methods are 
advantageous. The basic theory of each method is 
first stated and then the various ways in which it 
may be used are given. Realistic, practical problems 
are given as illustrations of the theory in use. 


Publication—May 1959 525 pages 6 x 9 inches 
Probable price $7.50 list 


MODERN STATISTICAL METHODS: 
DESCRIPTIVE AND INDUCTIVE 


CONTENTS 


COLLEGE DEPARTMENT 





Palmer O. Johnson, University of Minnesota 
Robert W. B. Jackson, University of Toronto 


Development of Modern Statistical Methods 
Classification and Reduction of Univariate Data 
Bases of Statistical Reasoning 

Theory of Tests of Statistical Hypotheses 


Tests of Statistical Hypotheses Expressed in Terms 
of Proportions or Percentages 


Statistical Hypotheses Expressed in Terms of Fre- 
quencies 


Tests of Statistical Hypotheses Expressed in Terms 
of Means 


Tests of Statistical Hypotheses Expressed in Terms 
of Variances 


The Analysis of Variance 

Non-Parametric Tests of Statistical Hypotheses 
The Problem of Estimation 

Classification and Reduction of Bivariate Data 
Classification and Reduction of Multivariate Data 
Special Applications of Multivariate Analysis 
Design and Analysis of Statistical Investigations 


Rand McNally & Company 
P.O. Box 7600, Chicago 80, Illinois 
Coming Spring 1959 - - - 


the first text in a new series entitled 


UNIVERSITY TEXTS IN THE MATHEMATICAL SCIENCES 


PROBABILITY AND STATISTICAL 
INFERENCE FOR ENGINEERS: A First Course 


by CYRUS DERMAN and MORTON KLEIN, 
Department of Industrial and Management Engineering, 
Columbia University 
This is the first attempt to put the necessary material on probability 
and statistical inference into a brief, practical handbook. This new 
text will be of great value and use in engineering statistics courses 
and to all those whose work requires the ready availability of such a 
manual. 





approximately 168 pages illustrated tentatively $3.75 


Now in its second successful year— 


MEASUREMENT AND STATISTICS: 


A Basic Text Emphasizing Behavioral Science Applications 


by VIRGINIA L. SENDERS, Department of Psychology, 
University of Minnesota 


Since its publication last spring this new text has been received with 
extraordinary enthusiasm in colleges throughout the country. Class- 
room orders continue to come in, as do comments such as the follow- 
ing: 
“The presentation of statistics in the framework of 
the theory of measurement sets a new and sensible style 
for the elementary textbook. This fresh approach suc- 
ceeds in anchoring statistical procedures to the basic 
operations of measurement which determine what 
kinds of statistics are appropriate. Senders’ book keeps 
the student in close contact with the empirical realities 
of behavioral science.” 
S. S. Stevens, Harvard University 


Spring 1958 549 pages $6.00 


OXFORD UNIVERSITY PRESS, 417 Fifth Avenue, New York 16 
Coming in April... 


INTRODUCTION TO PROBABILITY 
AND STATISTICS 


By B. W. LINDGREN, Assistant Professor of Mathe- 
matics, and G. W. McELRATH, Professor and Di- 
rector of the Division of Industrial Engineering; both, 
Institute of Technology, University of Minnesota. 


Designed especially for a short course for engineering 
and industrial students, this introductory text presents 
classical and modern statistical methods based on a 
preliminary treatment of the concept of probability. 
Although some background in calculus is presupposed, 
the book is not a “pure” mathematical approach 
to statistics. Numerous problems, illustrations and 
such modern techniques as non-parametric tests are 
also included. The last chapter contains a short intro- 
duction to fields for further study. 


Recently published .. . 
THE THEORY OF GROUPS 


By MARSHALL HALL, JR., Professor of Mathe- 
matics, The Ohio State University 


This book presents both the fundamentals of group 
theory and a broad selection from the most recent and 
active areas of research in this field. Included are the 
theory of free groups and free products, a lattice 
theoretical approach to properties of subgroup series, 
and a chapter on group representation. Original ma- 
terial on the Burnside problem and on projective 
planes is especially noteworthy. 

January 1959 


The Macmillan Company 


60 FIFTH AVENUE, NEW YORK 11, N. Y. 
STATISTICIAN 


Detroit Research Laboratories has opening in Statistical 
Planning Group. Requires person with M.S. or Ph.D. degree 
in statistics and two to five years’ industrial experience in 
engineering, chemical or physical applications. Principal 
duties involve the planning and analysis of experimental 
work of a diverse nature encountered in our automotive and 
chemical research laboratories. The data are often charac- 
terized by high variability and high cost. Two digital com- 
puters are available to the group. For more particulars 


write to: 


Personnel Manager 
Ethyl Corporation 
Research Laboratories 
1600 W. Eight Mile Road 
Ferndale 20, Michigan 








... A NEW TEXTBOOK ... 


BUSINESS 
STATISTICS 


By Dr. JOHN R. STOCKTON 
The University of Texas 


Here is a skillfully written book that stresses the value of statistical 
analysis in the solution of business problems. The mathematics needed 
by the student is explained as the principles of analysis and the formulas 
are presented. Many charts are used to illustrate the various uses of the 


graphic method. 

The problem material in BUSINESS STATISTICS is outstanding, both 
in quality and quantity. A workbook, containing twenty long problems, 
is available. 


SOUTH-WESTERN PUBLISHING CO. 


(Specialists in Business and Economic Education) 
Cincinnati 27 New Rochelle, N.Y. Chicago 5 San Francisco 3 Dallas 2 











Please mention the Journal of the American Statistical Association in writing advertisers 








THE INCOME OF NATIONS: 


Theory, Measurement, and Analysis Past and Present 
A Study in Applied Economics and Statistics 


By PAUL STUDENSKI 


This definitive work is the most comprehensive study of the national 
income ever published. It gives a unified presentation on a world wide 
scale of the economic, statistical, political, and historical background of 
this important area of over-all economic analysis and planning. STA- 
TISTICAL APPENDIX: National Income or Product of 87 countries. 
Summary Table, Notes to Chapters, Index. 


Diagrams, Exhibits, and 177 tables 576 pp. 714 x 10 $25.00 


HIGH PRAISE FOR THIS MONUMENTAL VOLUME— 


“A reference work of wide use for years to come.” 
Simon Kuznets, Prof. of Political Economy, Johns Hopkins University 


“The book is exceptionally comprehensive and valuable.” 
Roy Blough, Prof. of International Business, Columbia University 


NEW YORK UNIVERSITY PRESS — Washington Square, New York 











A complete introduction to . . . 


PRINCIPLES of STATISTICAL ANALYSIS 


SAMUEL B. RICHMOND, Columbia University 


This highly teachable textbook is designed as an introduction 
to statistical analysis for students of business and economics. 
Detailed illustrative material combines with the text to present 
a thorough treatment of the collection, analysis, and presenta- 
tion of statistical data. The book is organized around the modern 
concept of statistical induction. Mathematical procedures are 
kept to a minimum, and those techniques employed are intro- 
duced and explained at the point of use. A unique feature is the 
Glossary of Equations in which each equation in the text is listed, 
located, and explained. 210 ills., tables; 491 pp. $6.50 


• “A good textbook for a beginning course.” 
Francis B. May, University of Texas, 
in Southwest Social Science Quarterly 


• “The text is a solid, well-done book.” 
Harry Malisoff, Brooklyn College, 
in The Statistical Review 


THE RONALD PRESS COMPANY • 15 E. 26th St., New York 10 
JUST PUBLISHED 


QUALITY CONTROL 
AND INDUSTRIAL STATISTICS 


REVISED EDITION 
By ACHESON J. DUNCAN, The Johns Hopkins University 


The Revised Edition of this outstanding basic text has been rewritten 
and brought up to date to reflect the latest developments in the field. 
Principal changes from the widely used previous edition include: 


• Separation of the discussion of acceptance sampling from that of 
rectifying (AOQL) inspection 

• Addition of a special chapter on multiple sampling plans 

• Complete rewriting of the material on acceptance by variables 

• Enlargement of the section on continuous sampling. 


Some 400 questions and problems are provided, many of them new to 
this edition. A complete Solutions Manual is available to adopters. 


RICHARD D. IRWIN, INC. • HOMEWOOD, ILLINOIS 








1958 MEMBERSHIP DIRECTORY 
of the 


American Statistical Association 


The 1958 Membership Directory of the American Statistical Association is available for im- 
mediate shipment. This edition contains over 6,500 names. Information includes: member's 
name and title; business affiliation and address; other business affiliations, if any; degrees, 
with year granted and institution; fields of specialization, including methodological techniques 
and fields of application; major types of statistical activities; and sectional interest. 


In addition to the alphabetical listing, there is a complete geographical listing by city and state 
for the United States, city and province for Canada, and by country for the rest of the world. 
There is also a complete listing by membership in the five Sections of the Association: Biometrics 
Section, Business and Economic Statistics Section, Section on Physical and Engineering 
Sciences, Social Statistics Section, and Section on the Training of Statisticians. 





The new Directory measures 8½” x 11” over-all, and contains 160 pages. 





Use the order form below to request your copies. Price: $4.50 per copy, if remittance accompanies 
order. An additional charge of $.50 will be made, per copy, on orders received without remittance. 


ORDER To: American Statistical Association, 1757 K St., N. W., Washington 6, D. C. 
Please send me copies of the 1958 Membership Directory, @ $4.50 per 
copy (remittance included herewith), Or bill me at the $5.00 rate and send invoice. 

☐ Payment Enclosed ☐ Bill Us 
BIOMETRICS 


Journal of the Biometric Society 


Vol. 15, No. 1 CONTENTS March 1959 


Host Variability in Dilution Experiments Peter Armitage 
Restricted Selection Indices Oscar Kempthorne and Arne W. Nordskog 


Equilibria in Auto-Tetraploids Under Natural Selection for a Simplified Model of 
Viabilities ........ P. A. Parsons 


The Analysis of a Non-Replicated Experiment Involving a Single Four-Course 
Rotation of Crops H. D. Patterson 


The Analysis of a Two Phase Experiment R. N. Curnow 
Analysis of Quadruple Rectangular Lattice Designs John Leroy Folks 


An Examination of Some Methods of Comparing Several Rates or Proportions 
Mindel C. Sheps 


The Analysis of Experiments on Growth Rate ........ F. B. Leech and M. J. R. Healy 


Calculation of Chi-Square to Test the No Three-Factor Interaction Hypothesis .... 
Marvin A. Kastenbaum and Donald E. Lamphiear 


Extra-Period Change-Over Designs H. D. Patterson and H. L. Lucas 
Queries and Notes 


Replication of Non-Center Points in the Rotatable and Near-Rotatable Central 
Composite Design ........ G. E. P. Box 


Significance of Difference Between Two Non-Independent Correlation 
Coefficients ........ E. J. Williams 





Biometrics is published quarterly. Its objects are to describe and exemplify the use of 
mathematical and statistical methods in biological and related sciences in a form 
assimilable by experimenters. The annual non-member subscription rate is $7.00. In- 
quiries, orders for back issues, and non-member subscriptions should be addressed to: 


BIOMETRICS 
Department of Statistics 
Virginia Polytechnic Institute 
Blacksburg, Virginia 
The Annals of Mathematical Statistics 


THE OFFICIAL JOURNAL OF THE INSTITUTE OF 
MATHEMATICAL STATISTICS 


Vol. 30, No. 1—March, 1959 


Contents 


Invariance Theory and a Modified Minimax Principle Oscar Wesler 
On Linear Associative Algebras Corresponding to Association Schemes of Partially 
Balanced Designs R. C. Bose and Dale M. Mesner 
On a Characterization of the Triangular Association Scheme ....... S. S. Shrikhande 
On a Generalisation of the Kronecker Product Designs B. V. Shah 
Application of a Measure of Information to the Design and Comparison of Regression 
Experiments 
Note on Estimating Information Colin R. Blyth 
Unbiased Sequential Estimation for Binomial Populations Morris H. DeGroot 
A Single-Sample Multiple-Decision Procedure for Selecting the Multinomial Event 
Which Has the Highest Probability 
Robert E. Bechhofer, Salah Elmaghraby, and Norman Morse 
A Property of the Multinomial Distribution Harry Kesten and Norman Morse 
A Classification Problem Involving Multinomials Oscar Wesler 
A Problem in Optimum Filtering with Finite Data Gopinath Kallianpur 
On the Limiting Distribution of the Number of Coincidences Concerning Telephone 


ifferential Equation of Takacs. Ir 
Symmetrizable Markov Matrices 
On Some Statistical Tests for Mth Order Markov Chains 
Consensus of Subjective Probabilities: The Pari-Mutuel Method 
Edmund Eisenberg and David Gale 
A Note on Perfect Probability Gopinath Kallianpur 
On the Distribution of the Kolmogorov-Smirnov D-Statistic 
Pedro Egydio de Oliveira Carvalho 
Equality of More Than Two Variances and of More Than Two Dispersion Matrices 
Against Certain Alternatives ........ R. Gnanadesikan 
On the Theory of BAN Estimates ........ Robert A. Wijsman 
Estimation of the Medians for Dependent Variables Olive Jean Dunn 
A Consistent Estimator of a Component of a Convolution William R. Gaffey 
Bayes and Minimax Procedures in Sampling from Finite and Infinite Populations 


An Approximation Useful in Univariate Stratification 
Truncation and Tests of Hypotheses Om P. Aggarwal and Irwin Guttman 


Notes: 


Bartlett Decomposition and Wishart Distribution A. M. Kshirsagar
Bounds on the Distribution Functions of the Largest and Smallest Roots of
Normal Determinantal Equations Ray Mickey
Note on a Moving Single Server Problem
S. Karlin, R. G. Miller, Jr., and N. U. Prabhu
Distribution of the “Blocks Adjusted for Treatments” Sum of Squares in Incomplete
Block Designs A.
A Series of Symmetrical Group Divisible Incomplete Block Designs D. Sprott
Alternative Proof of a Theorem of Birnbaum and Pyke Nicolaas H. Kuiper
Quasi-Ranges of Samples from an Exponential Population Paul R. Rider
Acknowledgment of Priority Ralph G. Stanton
Correction to “Random Orthogonal Transformations and their Use in Some Classical
Distribution Problems in Multivariate Analysis” Robert A. Wijsman


Abstracts of Papers 

News and Notices 

Report of the Monterey, California, Meeting 
Final Editorial Report 

Publications Received 










Address orders for subscriptions and back numbers to: George E. Nicholson,
Jr., Secretary, Institute of Mathematical Statistics, Department of Statistics,
University of North Carolina, Chapel Hill, North Carolina.
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Announcing the New Journal


TECHNOMETRICS


A Journal of Statistics for the Physical,
Chemical and Engineering Sciences 
Jointly sponsored by the American 
Society for Quality Control and the 

American Statistical Association 





TECHNOMETRICS is a new publication starting with the February, 1959 issue, dedicated 
to the development and meaningful use of statistical methods in the physical, 
chemical and engineering sciences. 

TECHNOMETRICS will be a journal designed to be read by the engineer, physical scientist 
and the statistician. 

TECHNOMETRICS will publish: 

(1) papers describing new statistical techniques expected to be useful in the
physical sciences;

(2) papers illustrating the application of known statistical methods in new or 
novel environments; 

(3) expository or tutorial articles on particular statistical methods;

(4) papers dealing with philosophy and problems of experimentation and quality
control. 

Whenever possible, published papers are to contain numerical examples.

TECHNOMETRICS will appear quarterly in February, May, August and November. 
(first and second issues of volume one will be delayed slightly) 





ARTICLES ACCEPTED FOR THE EARLY ISSUES INCLUDE: 


Response Surface Designs for Three Factors at Three Levels 
Some Statistical Aspects of the Economics of Analytical Testing
A Quick Compact Two Sample Test to Duckworth’s Specifications J. W. TUKEY
The Analysis of Life Test Data R. L. PLACKETT
Mathematical Probability in the Natural Science 
Partial Duplication of Factorial Experiments 
Calculations for Evolutionary Operation Program G. E. P. BOX & J. S. HUNTER
Measurements Made by Matching With Known Standards
W. J. YOUDEN, W. CONNOR & N. C. SEVERO











Subscription Rates: 
Members of the ASA $6.00 a year 
Members of the ASQC 
Non-members, libraries, etc. $8.00 a year


ORDER FORM 


Please send 
(Name) 
(Address) 

I enclose $ 


Signature 
* All remittances to be made out to the order of TECHNOMETRICS and mailed to:— 


TECHNOMETRICS 


404 Beacon Bldg. • 1757 K Street, N.W. • Washington 6, D.C.
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POPULATION STUDIES 


A Journal of Demography 
Edited by D. V. Glass and E. Grebenik


Vol. XII, No. 2 CONTENTS November 1958 


C. E. V. LESER. Trends in Women’s Work Participation. 

BERTRAM HUTCHINSON. Structural and Exchange Mobility in the Assimilation of 
Immigrants to Brazil. 

H. HYRENIUS. Fertility and Reproduction in a Swedish Population Group without 
Family Limitation. 

W. S. HOCKING. A Method of Forecasting the Future Composition of the Population of 
Great Britain by Marital Status. 

N. H. CARRIER. A Note on the Estimation of Mortality and other Population Charac- 
teristics given Deaths by Age. 

K. R. GABRIEL and ILANA RONEN. Estimates of Mortality from Infant Mortality
Rates.


BOOK REVIEWS 


Subscription price per volume of 3 parts 42s. net, post free 
(or American currency $6.75).
Single parts £1, each plus postage 
(American $3.25, post free). 


Published by the POPULATION INVESTIGATION COMMITTEE, 
at the LONDON SCHOOL OF ECONOMICS AND POLITICAL SCIENCE, 
15 HOUGHTON STREET, LONDON, W.C.2. 











THE JOURNAL OF FINANCE 


Published by THE AMERICAN FINANCE ASSOCIATION 
Volume XIV, No. 1 (March, 1959) includes: 


ARTICLES 


Stock Market “Patterns” and Financial Analysis
An Estimate of Bank-Administered Personal Trust Funds F 
Raymond W. Goldsmith and ~ Shapiro 
Theory of the Capital Structure of the Firm Eli Schwartz
The Policy eee in the United States of Reliance on Automatic Fiscal oy 
rown 
. Parks 


COMMUNICATIONS 


Comment on “Puts and Calls: A Factual Survey” Richard J. Kruizenga
Reply Charles B. Franklin and Marshall R. Colberg
A Godin Note on Time Deposit Interest Rates George R. Morrison 


Membership dues, including $3.00 allocated to subscription to The Journal of Finance,
are $5.00 annually. Libraries may subscribe to The Journal at $5.00 annually and
single copies may be purchased for $1.25. Applications for membership in the American
Finance Association and subscriptions to The Journal of Finance should be addressed
to the Secretary-Treasurer, sonra E. fesones yo BS School of Business Administration,
New York University, 90 Trinity Place, New York 6, New York.


Communications relating to the contents of The Journal of Finance should be addressed
to the Editor, Joel Segall, School of Business, University of Chicago, Chicago 37,
Illinois, or to the Associate Editor, Carl A. Dauten, School of Business and Public
Administration, Washington University, St. Louis 5, Missouri.
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JOURNAL OF FARM ECONOMICS 
Published by 
THE AMERICAN FARM ECONOMIC ASSOCIATION 
Editor: Robert L. Clodius
University of Wisconsin, Madison, Wisconsin 
Volume XL November 1958 Number 4 





Estimation of Spatial Price Equilibrium Models ..... G. G. Judge and T. D. Wallace 
‘Price Mapping’ of Optimum Changes in Enterprises ..... W. W. McPherson and J. E. Faris
Changes in Cotton Acreage in the Southeast—Implications for Supply Functions
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